Welcome

Column

Why this project

  • Get back into the books and notes to refresh on concepts and software

  • Refresh on and practice using R/RStudio

  • Experiment with flexdashboards to see whether and how I might want to incorporate them into a workflow process

  • Work with a relevant data set

  • Learn new things about flexdashboards in R/R Markdown (e.g., using HTML for picture sizing & placement)

Important Note(s)

  • Source code for this project can be found on my GitHub repo at HR_Analytics.

  • This is best viewed on a wide-screen monitor. A 24" or larger monitor is recommended.

  • After opening this file, expand or maximize the window to properly view it.

  • A small or reduced window size causes the top tabs to wrap to a second line in the header row, which collapses the page contents and hides various window/section headers.

Column

Experimental section

This section demonstrates showing a code block without the result(s).

a <- 2 + 2

b <- function(x){
  print(x^2)
}

b(a)

rm(a, b) # clean memory
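In R Markdown, showing a chunk's code while suppressing its printed results is normally controlled by knitr chunk options. The exact header used in this dashboard's source isn't shown, so the following is only a sketch:

```r
# Chunk-header sketch (this lives in the .Rmd source; shown here as a comment):
#   ```{r, echo = TRUE, results = 'hide'}
a <- 2 + 2   # the code appears in the rendered document...
print(a)     # ...but this printed value would be suppressed by results = 'hide'
```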

About The Data

Column

Data Source

IBM HR Analytics Employee Attrition & Performance
Downloaded from: https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset

  • Data is fictional - created by IBM data scientists

  • Insight considerations (general):

    • Predict attrition
    • What factors contribute to attrition
    • Once factors contributing to attrition are identified, deep-dive and/or compare groups to develop an understanding of those factors

Data Sample (scrollable)

age attrition businesstravel dailyrate department distancefromhome education educationfield employeecount employeenumber environmentsatisfaction gender hourlyrate jobinvolvement joblevel jobrole jobsatisfaction maritalstatus monthlyincome monthlyrate numcompaniesworked over18 overtime percentsalaryhike performancerating relationshipsatisfaction standardhours stockoptionlevel totalworkingyears trainingtimeslastyear worklifebalance yearsatcompany yearsincurrentrole yearssincelastpromotion yearswithcurrmanager
41 1 travel_rarely 1102 sales 1 2 life sciences 1 1 2 female 94 3 2 sales executive 4 single 5993 19479 8 y yes 11 3 1 80 0 8 0 1 6 4 0 5
49 0 travel_frequently 279 research & development 8 1 life sciences 1 2 3 male 61 2 2 research scientist 2 married 5130 24907 1 y no 23 4 4 80 1 10 3 3 10 7 1 7
37 1 travel_rarely 1373 research & development 2 2 other 1 4 4 male 92 2 1 laboratory technician 3 single 2090 2396 6 y yes 15 3 2 80 0 7 3 3 0 0 0 0
33 0 travel_frequently 1392 research & development 3 4 life sciences 1 5 4 female 56 3 1 research scientist 3 married 2909 23159 1 y yes 11 3 3 80 0 8 3 3 8 7 3 0
27 0 travel_rarely 591 research & development 2 1 medical 1 7 1 male 40 3 1 laboratory technician 2 married 3468 16632 9 y no 12 3 4 80 1 6 3 3 2 2 2 2

Column

Data Overview

value
rows 1470
columns 35
discrete_columns 8
continuous_columns 27
all_missing_columns 0
total_missing_values 0
complete_rows 1470
total_observations 51450
memory_usage 378144
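The overview table above resembles the output of DataExplorer::introduce(); that is likely what generated it, though the source isn't shown. A base-R sketch of the same idea:

```r
# Base-R reimplementation of a data-overview summary (sketch, not the
# dashboard's actual code):
overview <- function(df) {
  discrete <- vapply(df, function(x) is.factor(x) || is.character(x), logical(1))
  c(rows                 = nrow(df),
    columns              = ncol(df),
    discrete_columns     = sum(discrete),
    continuous_columns   = sum(!discrete),
    total_missing_values = sum(is.na(df)),
    complete_rows        = sum(complete.cases(df)),
    total_observations   = nrow(df) * ncol(df))
}
overview(iris)  # demo on a built-in data set
```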

Data Dictionary (scrollable)

\(attrition\)
0 ‘No’ 1 ‘Yes’

\(education\)
1 ‘Below College’ 2 ‘College’ 3 ‘Bachelor’ 4 ‘Master’ 5 ‘Doctor’

\(environmentsatisfaction\)
1 ‘Low’ 2 ‘Medium’ 3 ‘High’ 4 ‘Very High’

\(jobinvolvement\)
1 ‘Low’ 2 ‘Medium’ 3 ‘High’ 4 ‘Very High’

\(joblevel\)
Ordinal levels represented by 1, 2, 3, 4, 5. No further meaning known.

\(jobsatisfaction\)
1 ‘Low’ 2 ‘Medium’ 3 ‘High’ 4 ‘Very High’

\(performancerating\)
1 ‘Low’ 2 ‘Good’ 3 ‘Excellent’ 4 ‘Outstanding’

\(relationshipsatisfaction\)
1 ‘Low’ 2 ‘Medium’ 3 ‘High’ 4 ‘Very High’

\(stockoptionlevel\)
Ordinal levels represented by 0, 1, 2, 3. No further meaning known.

\(worklifebalance\)
1 ‘Bad’ 2 ‘Good’ 3 ‘Better’ 4 ‘Best’


factor_level meaning
businesstravel1 non-travel
businesstravel2 travel_rarely
businesstravel3 travel_frequently
department1 human resources
department2 research & development
department3 sales
educationfield1 human resources
educationfield2 life sciences
educationfield3 marketing
educationfield4 medical
educationfield5 other
educationfield6 technical degree
gender1 female
gender2 male
jobrole1 healthcare representative
jobrole2 human resources
jobrole3 laboratory technician
jobrole4 manager
jobrole5 manufacturing director
jobrole6 research director
jobrole7 research scientist
jobrole8 sales executive
jobrole9 sales representative
maritalstatus1 single
maritalstatus2 married
maritalstatus3 divorced
over18 y
overtime1 no
overtime2 yes
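The factor_level table above implies a factoring step with explicitly ordered levels. A hypothetical sketch (the data frame name `hr` is an assumption):

```r
# Factoring businesstravel so integer codes follow the dictionary's order
# (hypothetical data; not the dashboard's actual code):
hr <- data.frame(businesstravel = c("travel_rarely", "non-travel",
                                    "travel_frequently"))
hr$businesstravel <- factor(hr$businesstravel,
                            levels = c("non-travel", "travel_rarely",
                                       "travel_frequently"))
as.integer(hr$businesstravel)  # integer codes match the factor_level numbering
```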

Missing Values

Missing Values

Feature Distributions

Column

Continuous data distributions

Discrete data distributions

Correlations

Column

All-Data Correlations

Continuous Data Correlations

Discrete Data Correlations

Initial Observations/Notes

Column

Notes

  • Using a 70/30 train/test split for assessing model performance

  • Models to explore: Logistic regression (manual and step-wise), Sparse logistic regression

  • Although \(joblevel\) and \(stockoptionlevel\) show as numbers, they represent distinct, ordered levels. As such, we will leave them as quantitative values but interpret them as ordinal variables in this analysis.

  • Refer to Data Exploration \(\rightarrow\) About The Data \(\rightarrow\) Data Dictionary section to see which variables were factored and their corresponding factor levels.

  • \(maritalstatus\) was factored as an ordinal variable
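The 70/30 split noted above can be sketched as follows (the seed and object names are assumptions; the dashboard's actual code isn't shown):

```r
# 70/30 train/test split sketch
set.seed(42)
n <- 1470                                     # rows in the full data set
train_idx <- sample(n, size = floor(0.70 * n))
length(train_idx)  # 1029 training rows -- the total seen in the tables later on
```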


Observations

  • The following variables/predictors are not needed for analysis:

    • \(employeecount\) –> each observation represents a single employee
    • \(over18\) –> all employees are over 18
    • \(standardhours\) –> has only 1 unique value (80)
    • \(employeenumber\) –> simply an identifier for referencing individual records
  • There are no missing values; therefore, no imputation or removal of instances is required

  • Multicollinearity observed - high correlations involving the following continuous variables could affect the model:

    • \(age\) \(\rightarrow\) \(joblevel\), \(monthlyincome\), \(totalworkingyears\) and \(yearsatcompany\)
    • \(joblevel\) \(\rightarrow\) \(monthlyincome\), \(totalworkingyears\) and \(yearsatcompany\)
    • \(monthlyincome\) \(\rightarrow\) \(totalworkingyears\) and \(yearsatcompany\)
    • \(percentsalaryhike\) \(\rightarrow\) \(performancerating\)
    • \(totalworkingyears\) \(\rightarrow\) \(yearsatcompany\)
    • \(yearsatcompany\) \(\rightarrow\) \(yearsincurrentrole\), \(yearssincelastpromotion\) and \(yearswithcurrmanager\)
  • High correlations among categorical data levels will be ignored initially. I’m choosing to do this because:

    • I am not expanding the data set to include dummy variables for each category level. Doing so will increase the dimensionality.
    • The variable selection process may resolve the issue.
  • The data is unbalanced on the response variable \(attrition\)

    attrition  Freq
            0  1233
            1   237

  • In his book An Introduction to Categorical Data Analysis (2nd Ed.), Agresti discusses a guideline suggesting that there “…ideally be at least 10 outcomes of each type for every predictor.” With 237 attrition cases, this guideline indicates that there should be no more than 23-24 predictors in our final logistic regression model.
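The arithmetic behind the imbalance and the predictor cap (counts from the frequency table above):

```r
# Class balance and Agresti's rule-of-thumb cap
n_stay <- 1233; n_leave <- 237
n_leave / (n_stay + n_leave)  # about 0.16 -- roughly 16% attrition (unbalanced)
floor(n_leave / 10)           # 23 -- cap on predictors per the guideline
```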

Saturated (Full) Model

Saturated (Full) Model

\[\begin{align} logit[P(attrition = 1 (Yes))] = &\beta_0 + \beta_1age + \beta_2businesstravel + \\ &\beta_3dailyrate + \beta_4department + \beta_5distancefromhome + \beta_6education + \\ &\beta_7educationfield + \beta_8environmentsatisfaction + \beta_9gender + \beta_{10}hourlyrate + \\ &\beta_{11}jobinvolvement + \beta_{12}joblevel + \beta_{13}jobrole + \beta_{14}jobsatisfaction + \\ &\beta_{15}maritalstatus + \beta_{16}monthlyincome + \beta_{17}monthlyrate + \beta_{18}numcompaniesworked + \\ &\beta_{19}overtime + \beta_{20}percentsalaryhike + \beta_{21}performancerating + \\ &\beta_{22}relationshipsatisfaction + \beta_{23}stockoptionlevel + \\ &\beta_{24}totalworkingyears + \beta_{25}trainingtimeslastyear + \\ &\beta_{26}worklifebalance + \beta_{27}yearsatcompany + \beta_{28}yearsincurrentrole + \\ &\beta_{29}yearssincelastpromotion + \beta_{30}yearswithcurrmanager \end{align}\]
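The saturated model above is an ordinary logistic GLM fit with glm(). The prepared HR training frame isn't shown, so this sketch illustrates the call on a built-in data set:

```r
# Logistic regression sketch; the coefficient table below has the same columns
# as coef(summary(.)) for any binomial glm
m <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
coef(summary(m))  # Estimate, Std. Error, z value, Pr(>|z|)
```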

Estimate Std. Error z value Pr(>|z|)
(Intercept) -12.40085 604.81628 -0.02050 0.98364
age -0.02413 0.01627 -1.48349 0.13794
businesstraveltravel_rarely 1.31458 0.49256 2.66886 0.00761
businesstraveltravel_frequently 2.27719 0.53433 4.26177 2e-05
dailyrate -0.00038 0.00027 -1.42892 0.15303
departmentresearch & development 14.05655 604.81332 0.02324 0.98146
departmentsales 14.06676 604.81348 0.02326 0.98144
distancefromhome 0.03483 0.01299 2.68027 0.00736
education -0.04364 0.10747 -0.40609 0.68468
educationfieldlife sciences -0.18511 1.01477 -0.18241 0.85526
educationfieldmarketing 0.31086 1.06724 0.29128 0.77084
educationfieldmedical -0.20909 1.01499 -0.20600 0.83679
educationfieldother -0.49864 1.11872 -0.44572 0.6558
educationfieldtechnical degree 0.97273 1.03981 0.93549 0.34954
environmentsatisfaction -0.50942 0.10230 -4.97947 0
gendermale 0.39426 0.22275 1.76999 0.07673
hourlyrate 0.00539 0.00546 0.98715 0.32357
jobinvolvement -0.57517 0.14525 -3.95971 8e-05
joblevel -0.21105 0.38549 -0.54749 0.58404
jobrolehuman resources 16.06546 604.81381 0.02656 0.97881
jobrolelaboratory technician 1.58117 0.62439 2.53236 0.01133
jobrolemanager 0.42054 1.08537 0.38747 0.69841
jobrolemanufacturing director 0.41759 0.67944 0.61461 0.53881
jobroleresearch director -2.36486 1.42958 -1.65423 0.09808
jobroleresearch scientist 0.88143 0.62900 1.40132 0.16112
jobrolesales executive 1.12898 1.31113 0.86107 0.3892
jobrolesales representative 2.42566 1.38310 1.75379 0.07947
jobsatisfaction -0.33215 0.10019 -3.31520 0.00092
maritalstatusmarried -0.80895 0.30540 -2.64883 0.00808
maritalstatusdivorced -1.12109 0.42434 -2.64198 0.00824
monthlyincome 0.00008 0.00010 0.76904 0.44187
monthlyrate 0.00000 0.00002 0.29423 0.76858
numcompaniesworked 0.21586 0.04653 4.63894 0
overtimeyes 2.17380 0.24219 8.97541 0
percentsalaryhike -0.04615 0.04766 -0.96819 0.33295
performancerating 0.25050 0.49705 0.50397 0.61428
relationshipsatisfaction -0.24206 0.09985 -2.42427 0.01534
stockoptionlevel -0.18874 0.18727 -1.00785 0.31353
totalworkingyears -0.08158 0.03611 -2.25895 0.02389
trainingtimeslastyear -0.19218 0.08583 -2.23921 0.02514
worklifebalance -0.27068 0.15456 -1.75132 0.07989
yearsatcompany 0.12042 0.04726 2.54777 0.01084
yearsincurrentrole -0.20571 0.05674 -3.62563 0.00029
yearssincelastpromotion 0.16701 0.05117 3.26377 0.0011
yearswithcurrmanager -0.09943 0.05982 -1.66212 0.09649

Check X vs. Y Independence

Column

Check X vs. Y Independence

  • Here, we’ll check whether each categorical variable is related to the response (\(H_o:\) no relationship/independence) using contingency tables, \(\chi^2\) statistics, and \(p-values\). Where a contingency table pairs an ordinal variable with attrition (nominal, 2 levels), we will use the Cochran-Mantel-Haenszel (CMH) test, a linear trend test, since it has more power. The CMH result appears directly beneath the corresponding CrossTable and is indicated by d.f. = 1.


  • Use \(\chi^2\) test for independence for the following predictors w/ attrition:
    • \(businesstravel\)
    • \(department\)
    • \(educationfield\)
    • \(gender\)
    • \(jobrole\)
    • \(maritalstatus\)
    • \(overtime\)


  • CMH test for the remaining categorical predictors vs. attrition


  • The following predictors show no significant association with the response (i.e. \(p-value > 0.05\)). We will remove these predictors from the model since the response does not appear to depend on them.
    • \(gender\)
    • \(relationshipsatisfaction\)
    • \(worklifebalance\)
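The \(\chi^2\) results in the next column can be reproduced by hand. For example, using the businesstravel counts from the first CrossTable:

```r
# Reproducing the businesstravel chi-squared test (counts copied from the
# CrossTable output)
tab <- matrix(c( 99,   7,
                620, 113,
                141,  49),
              nrow = 3, byrow = TRUE,
              dimnames = list(businesstravel = c("non-travel", "travel_rarely",
                                                 "travel_frequently"),
                              attrition = c("0", "1")))
chisq.test(tab)  # Chi^2 = 20.13, d.f. = 2
```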

Column

\(\chi^2\) tests (scrollable)


 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                  | attrition 
   businesstravel |         0 |         1 | Row Total | 
------------------|-----------|-----------|-----------|
       non-travel |        99 |         7 |       106 | 
                  |    88.591 |    17.409 |           | 
------------------|-----------|-----------|-----------|
    travel_rarely |       620 |       113 |       733 | 
                  |   612.614 |   120.386 |           | 
------------------|-----------|-----------|-----------|
travel_frequently |       141 |        49 |       190 | 
                  |   158.795 |    31.205 |           | 
------------------|-----------|-----------|-----------|
     Column Total |       860 |       169 |      1029 | 
------------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  20.13083     d.f. =  2     p =  4.252523e-05 


 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                       | attrition 
            department |         0 |         1 | Row Total | 
-----------------------|-----------|-----------|-----------|
       human resources |        33 |         8 |        41 | 
                       |    34.266 |     6.734 |           | 
-----------------------|-----------|-----------|-----------|
research & development |       583 |        91 |       674 | 
                       |   563.304 |   110.696 |           | 
-----------------------|-----------|-----------|-----------|
                 sales |       244 |        70 |       314 | 
                       |   262.430 |    51.570 |           | 
-----------------------|-----------|-----------|-----------|
          Column Total |       860 |       169 |      1029 | 
-----------------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  12.35835     d.f. =  2     p =  0.002072139 


 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                 | attrition 
  educationfield |         0 |         1 | Row Total | 
-----------------|-----------|-----------|-----------|
 human resources |        15 |         4 |        19 | 
                 |    15.879 |     3.121 |           | 
-----------------|-----------|-----------|-----------|
   life sciences |       360 |        63 |       423 | 
                 |   353.528 |    69.472 |           | 
-----------------|-----------|-----------|-----------|
       marketing |        94 |        27 |       121 | 
                 |   101.127 |    19.873 |           | 
-----------------|-----------|-----------|-----------|
         medical |       275 |        42 |       317 | 
                 |   264.937 |    52.063 |           | 
-----------------|-----------|-----------|-----------|
           other |        49 |         5 |        54 | 
                 |    45.131 |     8.869 |           | 
-----------------|-----------|-----------|-----------|
technical degree |        67 |        28 |        95 | 
                 |    79.397 |    15.603 |           | 
-----------------|-----------|-----------|-----------|
    Column Total |       860 |       169 |      1029 | 
-----------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  20.20982     d.f. =  5     p =  0.001141335 


 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
             | attrition 
      gender |         0 |         1 | Row Total | 
-------------|-----------|-----------|-----------|
      female |       359 |        63 |       422 | 
             |   352.692 |    69.308 |           | 
-------------|-----------|-----------|-----------|
        male |       501 |       106 |       607 | 
             |   507.308 |    99.692 |           | 
-------------|-----------|-----------|-----------|
Column Total |       860 |       169 |      1029 | 
-------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  1.164534     d.f. =  1     p =  0.2805271 

Pearson's Chi-squared test with Yates' continuity correction 
------------------------------------------------------------
Chi^2 =  0.9872404     d.f. =  1     p =  0.3204178 

 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                          | attrition 
                  jobrole |         0 |         1 | Row Total | 
--------------------------|-----------|-----------|-----------|
healthcare representative |        83 |         5 |        88 | 
                          |    73.547 |    14.453 |           | 
--------------------------|-----------|-----------|-----------|
          human resources |        22 |         8 |        30 | 
                          |    25.073 |     4.927 |           | 
--------------------------|-----------|-----------|-----------|
    laboratory technician |       137 |        37 |       174 | 
                          |   145.423 |    28.577 |           | 
--------------------------|-----------|-----------|-----------|
                  manager |        74 |         4 |        78 | 
                          |    65.190 |    12.810 |           | 
--------------------------|-----------|-----------|-----------|
   manufacturing director |        84 |         7 |        91 | 
                          |    76.054 |    14.946 |           | 
--------------------------|-----------|-----------|-----------|
        research director |        60 |         1 |        61 | 
                          |    50.982 |    10.018 |           | 
--------------------------|-----------|-----------|-----------|
       research scientist |       182 |        39 |       221 | 
                          |   184.704 |    36.296 |           | 
--------------------------|-----------|-----------|-----------|
          sales executive |       183 |        43 |       226 | 
                          |   188.882 |    37.118 |           | 
--------------------------|-----------|-----------|-----------|
     sales representative |        35 |        25 |        60 | 
                          |    50.146 |     9.854 |           | 
--------------------------|-----------|-----------|-----------|
             Column Total |       860 |       169 |      1029 | 
--------------------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  63.88879     d.f. =  8     p =  8.000966e-11 


 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
              | attrition 
maritalstatus |         0 |         1 | Row Total | 
--------------|-----------|-----------|-----------|
       single |       245 |        84 |       329 | 
              |   274.966 |    54.034 |           | 
--------------|-----------|-----------|-----------|
      married |       405 |        63 |       468 | 
              |   391.137 |    76.863 |           | 
--------------|-----------|-----------|-----------|
     divorced |       210 |        22 |       232 | 
              |   193.897 |    38.103 |           | 
--------------|-----------|-----------|-----------|
 Column Total |       860 |       169 |      1029 | 
--------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  31.01857     d.f. =  2     p =  1.838245e-07 


 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
             | attrition 
    overtime |         0 |         1 | Row Total | 
-------------|-----------|-----------|-----------|
          no |       664 |        78 |       742 | 
             |   620.136 |   121.864 |           | 
-------------|-----------|-----------|-----------|
         yes |       196 |        91 |       287 | 
             |   239.864 |    47.136 |           | 
-------------|-----------|-----------|-----------|
Column Total |       860 |       169 |      1029 | 
-------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  67.73148     d.f. =  1     p =  1.873485e-16 

Pearson's Chi-squared test with Yates' continuity correction 
------------------------------------------------------------
Chi^2 =  66.19615     d.f. =  1     p =  4.082086e-16 

 

CMH tests (scrollable)


 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
             | attrition 
   education |         0 |         1 | Row Total | 
-------------|-----------|-----------|-----------|
           1 |        96 |        25 |       121 | 
             |       101 |        20 |           | 
-------------|-----------|-----------|-----------|
           2 |       160 |        33 |       193 | 
             |       161 |        32 |           | 
-------------|-----------|-----------|-----------|
           3 |       341 |        73 |       414 | 
             |       346 |        68 |           | 
-------------|-----------|-----------|-----------|
           4 |       234 |        35 |       269 | 
             |       225 |        44 |           | 
-------------|-----------|-----------|-----------|
           5 |        29 |         3 |        32 | 
             |        27 |         5 |           | 
-------------|-----------|-----------|-----------|
Column Total |       860 |       169 |      1029 | 
-------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  5.528328     d.f. =  4     p =  0.2372506 


 
     Chisq         Df       Prob 
4.36088337 1.00000000 0.03677323 
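The d.f. = 1 statistic above is the CMH linear-trend statistic, \(M^2 = (N-1)r^2\), where \(r\) is the Pearson correlation between the row and column scores. A sketch recomputing it from the education table's counts:

```r
# Linear-trend (CMH) statistic recomputed by hand from the education table
counts <- matrix(c( 96, 25,
                   160, 33,
                   341, 73,
                   234, 35,
                    29,  3),
                 nrow = 5, byrow = TRUE)
x <- rep(rep(1:5, each = 2),  times = as.vector(t(counts)))  # education scores
y <- rep(rep(0:1, times = 5), times = as.vector(t(counts)))  # attrition scores
M2 <- (sum(counts) - 1) * cor(x, y)^2
M2  # approximately 4.36, matching the Chisq value above
```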




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                        | attrition 
environmentsatisfaction |         0 |         1 | Row Total | 
------------------------|-----------|-----------|-----------|
                      1 |       144 |        55 |       199 | 
                        |       166 |        33 |           | 
------------------------|-----------|-----------|-----------|
                      2 |       190 |        30 |       220 | 
                        |       184 |        36 |           | 
------------------------|-----------|-----------|-----------|
                      3 |       262 |        48 |       310 | 
                        |       259 |        51 |           | 
------------------------|-----------|-----------|-----------|
                      4 |       264 |        36 |       300 | 
                        |       251 |        49 |           | 
------------------------|-----------|-----------|-----------|
           Column Total |       860 |       169 |      1029 | 
------------------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  23.95468     d.f. =  3     p =  2.553014e-05 


 
       Chisq           Df         Prob 
1.602041e+01 1.000000e+00 6.266329e-05 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
               | attrition 
jobinvolvement |         0 |         1 | Row Total | 
---------------|-----------|-----------|-----------|
             1 |        43 |        23 |        66 | 
               |        55 |        11 |           | 
---------------|-----------|-----------|-----------|
             2 |       212 |        48 |       260 | 
               |       217 |        43 |           | 
---------------|-----------|-----------|-----------|
             3 |       513 |        89 |       602 | 
               |       503 |        99 |           | 
---------------|-----------|-----------|-----------|
             4 |        92 |         9 |       101 | 
               |        84 |        17 |           | 
---------------|-----------|-----------|-----------|
  Column Total |       860 |       169 |      1029 | 
---------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  22.44157     d.f. =  3     p =  5.278864e-05 


 
       Chisq           Df         Prob 
1.856557e+01 1.000000e+00 1.641587e-05 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
             | attrition 
    joblevel |         0 |         1 | Row Total | 
-------------|-----------|-----------|-----------|
           1 |       284 |       102 |       386 | 
             |       323 |        63 |           | 
-------------|-----------|-----------|-----------|
           2 |       327 |        40 |       367 | 
             |       307 |        60 |           | 
-------------|-----------|-----------|-----------|
           3 |       126 |        20 |       146 | 
             |       122 |        24 |           | 
-------------|-----------|-----------|-----------|
           4 |        76 |         3 |        79 | 
             |        66 |        13 |           | 
-------------|-----------|-----------|-----------|
           5 |        47 |         4 |        51 | 
             |        43 |         8 |           | 
-------------|-----------|-----------|-----------|
Column Total |       860 |       169 |      1029 | 
-------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  48.98864     d.f. =  4     p =  5.870761e-10 


 
       Chisq           Df         Prob 
3.199786e+01 1.000000e+00 1.543427e-08 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                | attrition 
jobsatisfaction |         0 |         1 | Row Total | 
----------------|-----------|-----------|-----------|
              1 |       156 |        41 |       197 | 
                |       165 |        32 |           | 
----------------|-----------|-----------|-----------|
              2 |       157 |        32 |       189 | 
                |       158 |        31 |           | 
----------------|-----------|-----------|-----------|
              3 |       269 |        55 |       324 | 
                |       271 |        53 |           | 
----------------|-----------|-----------|-----------|
              4 |       278 |        41 |       319 | 
                |       267 |        52 |           | 
----------------|-----------|-----------|-----------|
   Column Total |       860 |       169 |      1029 | 
----------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  5.834937     d.f. =  3     p =  0.1199229 


 
     Chisq         Df       Prob 
5.20628004 1.00000000 0.02250544 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                         | attrition 
relationshipsatisfaction |         0 |         1 | Row Total | 
-------------------------|-----------|-----------|-----------|
                       1 |       158 |        45 |       203 | 
                         |       170 |        33 |           | 
-------------------------|-----------|-----------|-----------|
                       2 |       180 |        30 |       210 | 
                         |       176 |        34 |           | 
-------------------------|-----------|-----------|-----------|
                       3 |       277 |        47 |       324 | 
                         |       271 |        53 |           | 
-------------------------|-----------|-----------|-----------|
                       4 |       245 |        47 |       292 | 
                         |       244 |        48 |           | 
-------------------------|-----------|-----------|-----------|
            Column Total |       860 |       169 |      1029 | 
-------------------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  6.46917     d.f. =  3     p =  0.09088631 


 
    Chisq        Df      Prob 
2.3512264 1.0000000 0.1251845 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                 | attrition 
stockoptionlevel |         0 |         1 | Row Total | 
-----------------|-----------|-----------|-----------|
               0 |       332 |       109 |       441 | 
                 |       369 |        72 |           | 
-----------------|-----------|-----------|-----------|
               1 |       375 |        41 |       416 | 
                 |       348 |        68 |           | 
-----------------|-----------|-----------|-----------|
               2 |       104 |         9 |       113 | 
                 |        94 |        19 |           | 
-----------------|-----------|-----------|-----------|
               3 |        49 |        10 |        59 | 
                 |        49 |        10 |           | 
-----------------|-----------|-----------|-----------|
    Column Total |       860 |       169 |      1029 | 
-----------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  41.07117     d.f. =  3     p =  6.315821e-09 


 
       Chisq           Df         Prob 
2.017610e+01 1.000000e+00 7.062985e-06 




 
   Cell Contents
|-------------------------|
|                       N |
|              Expected N |
|-------------------------|

 
Total Observations in Table:  1029 

 
                | attrition 
worklifebalance |         0 |         1 | Row Total | 
----------------|-----------|-----------|-----------|
              1 |        36 |        16 |        52 | 
                |        43 |         9 |           | 
----------------|-----------|-----------|-----------|
              2 |       199 |        41 |       240 | 
                |       201 |        39 |           | 
----------------|-----------|-----------|-----------|
              3 |       541 |        93 |       634 | 
                |       530 |       104 |           | 
----------------|-----------|-----------|-----------|
              4 |        84 |        19 |       103 | 
                |        86 |        17 |           | 
----------------|-----------|-----------|-----------|
   Column Total |       860 |       169 |      1029 | 
----------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  9.601838     d.f. =  3     p =  0.02227229 


 
     Chisq         Df       Prob 
3.05963351 1.00000000 0.08025977 

Check VIFs

The model below reflects the removal of the three predictors identified in the previous section; we check it for collinearity using the VIF/GVIF values

\[\begin{align} logit[P(attrition = 1 (Yes))] = &\beta_0 + \beta_1age + \beta_2businesstravel + \\ &\beta_3dailyrate + \beta_4department + \beta_5distancefromhome + \beta_6education + \\ &\beta_7educationfield + \beta_8environmentsatisfaction + \beta_{10}hourlyrate + \\ &\beta_{11}jobinvolvement + \beta_{12}joblevel + \beta_{13}jobrole + \beta_{14}jobsatisfaction + \\ &\beta_{15}maritalstatus + \beta_{16}monthlyincome + \beta_{17}monthlyrate + \beta_{18}numcompaniesworked + \\ &\beta_{19}overtime + \beta_{20}percentsalaryhike + \beta_{21}performancerating + \\ &\beta_{23}stockoptionlevel + \beta_{24}totalworkingyears + \beta_{25}trainingtimeslastyear + \\ &\beta_{27}yearsatcompany + \beta_{28}yearsincurrentrole + \\ &\beta_{29}yearssincelastpromotion + \beta_{30}yearswithcurrmanager \end{align}\]

GVIF Df GVIF^(1/(2*Df))
age 1.840417e+00 1 1.356620
businesstravel 1.184785e+00 2 1.043302
dailyrate 1.066514e+00 1 1.032722
department 4.323419e+07 2 81.088047
distancefromhome 1.096092e+00 1 1.046944
education 1.116183e+00 1 1.056496
educationfield 3.469739e+00 5 1.132478
environmentsatisfaction 1.126510e+00 1 1.061371
hourlyrate 1.056816e+00 1 1.028016
jobinvolvement 1.092947e+00 1 1.045441
joblevel 1.068985e+01 1 3.269534
jobrole 2.961597e+08 8 3.384312
jobsatisfaction 1.118621e+00 1 1.057649
maritalstatus 2.274122e+00 2 1.228014
monthlyincome 1.108998e+01 1 3.330161
monthlyrate 1.079802e+00 1 1.039135
numcompaniesworked 1.380134e+00 1 1.174791
overtime 1.264672e+00 1 1.124576
percentsalaryhike 2.756273e+00 1 1.660203
performancerating 2.712538e+00 1 1.646978
stockoptionlevel 2.093548e+00 1 1.446910
totalworkingyears 4.932473e+00 1 2.220917
trainingtimeslastyear 1.054267e+00 1 1.026775
yearsatcompany 6.151852e+00 1 2.480293
yearsincurrentrole 2.695909e+00 1 1.641922
yearssincelastpromotion 2.319255e+00 1 1.522910
yearswithcurrmanager 3.091729e+00 1 1.758331
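A GVIF table like the one above can be produced with `car::vif` on the fitted glm; this is a sketch on a built-in data set, since the project's model object isn't reproduced here. With factor predictors, `vif` reports GVIF, Df, and GVIF^(1/(2*Df)), the columns shown above.

```r
# Sketch of the GVIF check using car::vif, demonstrated on mtcars; in the
# project the same call is applied to the fitted attrition glm.
library(car)

m <- glm(am ~ mpg + wt + factor(cyl), data = mtcars, family = binomial)
vif(m)
# One common rule of thumb treats GVIF^(1/(2*Df)) above ~2.24 (sqrt(5)) as a
# collinearity concern; department, joblevel, jobrole, monthlyincome, and
# yearsatcompany exceed that in the table above.
```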

Reduced Model (m2)

Column

Reduced model m2

  • Now that we’ve identified an initial set of variables to remove, we arrive at a reduced model (m2) in the form of:

\[\begin{align} logit[P(attrition = 1 (Yes))] = &\beta_0 + \beta_1age + \beta_2businesstravel + \\ &\beta_3dailyrate + \beta_5distancefromhome + \beta_6education + \\ &\beta_7educationfield + \beta_8environmentsatisfaction + \beta_{10}hourlyrate + \\ &\beta_{11}jobinvolvement + \beta_{14}jobsatisfaction + \\ &\beta_{15}maritalstatus + \beta_{17}monthlyrate + \beta_{18}numcompaniesworked + \\ &\beta_{19}overtime + \beta_{20}percentsalaryhike + \beta_{21}performancerating + \\ &\beta_{23}stockoptionlevel + \beta_{24}totalworkingyears + \beta_{25}trainingtimeslastyear + \\ &\beta_{28}yearsincurrentrole + \\ &\beta_{29}yearssincelastpromotion + \beta_{30}yearswithcurrmanager \end{align}\]

  • Does the model (m2) fit?
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.89762 1.59473 1.81700 0.06922
age -0.02393 0.01548 -1.54582 0.12215
businesstraveltravel_rarely 1.32695 0.47871 2.77193 0.00557
businesstraveltravel_frequently 2.21691 0.51385 4.31433 2e-05
dailyrate -0.00027 0.00026 -1.06635 0.28626
distancefromhome 0.03094 0.01229 2.51831 0.01179
education -0.10312 0.10155 -1.01549 0.30987
educationfieldlife sciences -0.87567 0.73925 -1.18454 0.2362
educationfieldmarketing -0.06232 0.76849 -0.08109 0.93537
educationfieldmedical -0.99789 0.74980 -1.33088 0.18323
educationfieldother -1.26278 0.89331 -1.41360 0.15748
educationfieldtechnical degree 0.26081 0.77526 0.33641 0.73656
environmentsatisfaction -0.50315 0.09647 -5.21562 0
hourlyrate 0.00348 0.00511 0.68159 0.4955
jobinvolvement -0.64914 0.14001 -4.63634 0
jobsatisfaction -0.27984 0.09371 -2.98619 0.00282
maritalstatusmarried -0.70602 0.28583 -2.47005 0.01351
maritalstatusdivorced -1.05249 0.40122 -2.62320 0.00871
monthlyrate 0.00000 0.00001 0.08527 0.93205
numcompaniesworked 0.17224 0.04265 4.03846 5e-05
overtimeyes 1.83452 0.21577 8.50199 0
percentsalaryhike -0.02972 0.04459 -0.66642 0.50515
performancerating 0.21818 0.47009 0.46412 0.64256
stockoptionlevel -0.13817 0.17670 -0.78195 0.43425
totalworkingyears -0.08632 0.02514 -3.43405 0.00059
trainingtimeslastyear -0.17493 0.08120 -2.15439 0.03121
yearsincurrentrole -0.13663 0.05123 -2.66700 0.00765
yearssincelastpromotion 0.17277 0.04674 3.69683 0.00022
yearswithcurrmanager -0.02938 0.05232 -0.56151 0.57445
Residual.Deviance Residual.df
642.7284 1000
  • We see that model m2 fits by \(\frac{deviance_{res}}{df_{res}} \leq 1\). Check the marginal model plots to assess model validity and to see if any of the continuous predictors are misspecified. If a predictor is misspecified, check the conditional density plots to see what kind of transformation may be needed.
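The informal fit check used here can be sketched as follows; simulated data stand in for the training set, with m2's actual numbers (642.7284 / 1000) quoted from the table above.

```r
# Sketch of the deviance-to-df fit check used throughout: a ratio of residual
# deviance to residual degrees of freedom at or below 1 is taken as evidence
# of adequate fit. Simulated data stand in for the training set.
set.seed(1)
d <- data.frame(x = rnorm(200))
d$y <- rbinom(200, 1, plogis(-2 + 0.8 * d$x))

m <- glm(y ~ x, data = d, family = binomial)
deviance(m) / df.residual(m)     # for m2 above: 642.7284 / 1000 = 0.64
deviance(m) / df.residual(m) <= 1
```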

Column

Marginal Model Plots (scrollable)

  • Check the marginal model plots (mmp) of the quantitative predictors to see if the model is specified correctly

Conditional Density Plots (scrollable)

  • \(age\) appears to be misspecified.

  • \(age\) appears to be approximately normally distributed, with similar variance for both values of \(attrition\) (i.e., yes and no). Let’s try adding a quadratic term for \(age\) to the model.
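The diagnostics referenced above can be sketched with `car::mmps` (marginal model plots for the quantitative predictors) and `cdplot` (conditional density of the response over a predictor). Simulated data stand in for the training set; in the project these are run on model m2.

```r
# Sketch of the marginal model plot and conditional density diagnostics;
# simulated data with a quadratic effect of age stand in for the training set.
library(car)

set.seed(2)
d <- data.frame(age = runif(400, 18, 60))
d$attrition <- factor(rbinom(400, 1, plogis(3 - 0.28 * d$age + 0.0034 * d$age^2)))

m <- glm(attrition ~ age, data = d, family = binomial)
mmps(m)                              # data smooth vs. model smooth per predictor
cdplot(attrition ~ age, data = d)    # conditional density of attrition by age

m_q <- update(m, . ~ . + I(age^2))   # quadratic term, as in model m3
```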

Reduced Model (m3)

Column

Model m3

  • Here we will include a quadratic term in the model for \(age\) and arrive at model (m3) in the form of:

\[\begin{align} logit[P(attrition = 1 (Yes))] = &\beta_0 + \beta_1age + \beta_{1a}age^2 + \beta_2businesstravel + \\ &\beta_3dailyrate + \beta_5distancefromhome + \beta_6education + \\ &\beta_7educationfield + \beta_8environmentsatisfaction + \beta_{10}hourlyrate + \\ &\beta_{11}jobinvolvement + \beta_{14}jobsatisfaction + \\ &\beta_{15}maritalstatus + \beta_{17}monthlyrate + \beta_{18}numcompaniesworked + \\ &\beta_{19}overtime + \beta_{20}percentsalaryhike + \beta_{21}performancerating + \\ &\beta_{23}stockoptionlevel + \beta_{24}totalworkingyears + \beta_{25}trainingtimeslastyear + \\ &\beta_{28}yearsincurrentrole + \\ &\beta_{29}yearssincelastpromotion + \beta_{30}yearswithcurrmanager \end{align}\]

  • Does the model (m3) fit?
Estimate Std. Error z value Pr(>|z|)
(Intercept) 7.29298 2.07533 3.51412 0.00044
age -0.28164 0.07867 -3.58011 0.00034
I(age^2) 0.00338 0.00101 3.36209 0.00077
businesstraveltravel_rarely 1.29460 0.48524 2.66794 0.00763
businesstraveltravel_frequently 2.21435 0.52003 4.25815 2e-05
dailyrate -0.00031 0.00026 -1.18928 0.23433
distancefromhome 0.03005 0.01236 2.43047 0.01508
education -0.04823 0.10426 -0.46265 0.64361
educationfieldlife sciences -0.92361 0.74617 -1.23780 0.21579
educationfieldmarketing -0.08924 0.77513 -0.11513 0.90834
educationfieldmedical -1.03216 0.75642 -1.36453 0.1724
educationfieldother -1.28829 0.90164 -1.42884 0.15305
educationfieldtechnical degree 0.15087 0.78271 0.19275 0.84715
environmentsatisfaction -0.50753 0.09726 -5.21854 0
hourlyrate 0.00382 0.00515 0.74058 0.45895
jobinvolvement -0.65153 0.14125 -4.61264 0
jobsatisfaction -0.27098 0.09469 -2.86171 0.00421
maritalstatusmarried -0.65875 0.28968 -2.27407 0.02296
maritalstatusdivorced -0.97599 0.40609 -2.40337 0.01624
monthlyrate 0.00000 0.00001 -0.09178 0.92687
numcompaniesworked 0.17913 0.04298 4.16796 3e-05
overtimeyes 1.84479 0.21882 8.43059 0
percentsalaryhike -0.03638 0.04525 -0.80396 0.42142
performancerating 0.26208 0.47515 0.55158 0.58123
stockoptionlevel -0.13170 0.17904 -0.73558 0.46199
totalworkingyears -0.08600 0.02482 -3.46514 0.00053
trainingtimeslastyear -0.17986 0.08211 -2.19050 0.02849
yearsincurrentrole -0.12795 0.05153 -2.48310 0.01302
yearssincelastpromotion 0.16528 0.04672 3.53777 4e-04
yearswithcurrmanager -0.01267 0.05313 -0.23840 0.81157
Residual.Deviance Residual.df
631.7444 999
  • We see that model m3 fits by \(\frac{deviance_{res}}{df_{res}} \leq 1\). Check the marginal model plots.

Column

Marginal Model Plots (scrollable)

  • Check the marginal model plots (mmp) of the quantitative predictors to see if the model is specified correctly

Observations on model m3

  • Adding the \(age^2\) term corrects some misspecification in the model.

  • Look at the standardized deviance residuals and inspect for outliers and bad leverage points

Leverage (Model m3)

Column

Leverage Plot

Observations

  • There don’t appear to be any bad leverage points, although there are several points of high leverage.

  • There do appear to be outliers in the data.

  • Since this is a simulated dataset, we will assume that there is sufficient reason for removing the outliers.
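The outlier/leverage screen described above can be sketched as follows: standardized deviance residuals with \(|r| > 2\) are flagged as outliers, and hat values above roughly \(2p/n\) as high leverage. Simulated data stand in for the training set; in the project this screen is applied to model m3.

```r
# Sketch of the outlier and leverage screen on a fitted glm; simulated data
# stand in for the training set.
set.seed(3)
d <- data.frame(x = rnorm(500))
d$y <- rbinom(500, 1, plogis(-2 + d$x))

m <- glm(y ~ x, data = d, family = binomial)
sdr <- rstandard(m)                  # standardized deviance residuals
lev <- hatvalues(m)

outliers <- which(abs(sdr) > 2)                          # candidate outliers
high_lev <- which(lev > 2 * length(coef(m)) / nrow(d))   # high-leverage points
length(outliers)
```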

Column

Number of outliers

34

Outlier indices

6, 11, 17, 38, 47, 62, 68, 82, 154, 184, 204, 264, 309, 321, 352, 364, 390, 558, 561, 606, 609, 660, 668, 680, 685, 693, 720, 730, 755, 786, 817, 837, 876 and 910

Outlier data (scrollable)

age attrition businesstravel dailyrate department distancefromhome education educationfield environmentsatisfaction gender hourlyrate jobinvolvement joblevel jobrole jobsatisfaction maritalstatus monthlyincome monthlyrate numcompaniesworked overtime percentsalaryhike performancerating relationshipsatisfaction stockoptionlevel totalworkingyears trainingtimeslastyear worklifebalance yearsatcompany yearsincurrentrole yearssincelastpromotion yearswithcurrmanager
28 1 travel_rarely 890 research & development 2 4 medical 3 male 46 3 1 research scientist 3 single 4382 16374 6 no 17 3 4 0 5 3 2 2 2 2 1
44 1 travel_rarely 935 research & development 3 3 life sciences 1 male 89 3 1 laboratory technician 1 married 2362 14669 4 no 12 3 3 0 10 4 4 3 2 1 2
32 1 non-travel 1474 sales 11 4 other 4 male 60 4 2 sales executive 3 married 4707 23914 8 no 12 3 4 0 6 2 3 4 2 1 2
39 1 travel_rarely 1162 sales 3 2 medical 4 female 41 3 2 sales executive 3 married 5238 17778 4 yes 18 3 1 0 12 3 2 1 0 0 0
39 1 travel_rarely 360 research & development 23 3 medical 3 male 93 3 1 research scientist 1 single 3904 22154 0 no 13 3 1 0 6 2 3 5 2 0 3
36 1 travel_rarely 530 sales 3 1 life sciences 3 male 51 2 3 sales executive 4 married 10325 5518 1 yes 11 3 1 1 16 6 3 16 7 3 7
27 1 travel_rarely 1420 sales 2 1 marketing 3 male 85 3 1 sales representative 1 divorced 3041 16346 0 no 11 3 2 1 5 3 3 4 3 0 2
21 1 travel_rarely 1427 research & development 18 1 other 4 female 65 3 1 research scientist 4 single 2693 8870 1 no 19 3 1 0 1 3 2 1 0 0 0
53 1 travel_rarely 607 research & development 2 5 technical degree 3 female 78 2 3 manufacturing director 4 married 10169 14618 0 no 16 3 2 1 34 4 3 33 7 1 9
44 1 travel_rarely 1376 human resources 1 2 medical 2 male 91 2 3 human resources 1 married 10482 2326 9 no 14 3 4 1 24 1 3 20 6 3 6
35 1 travel_frequently 880 sales 12 4 other 4 male 36 3 2 sales executive 4 single 4581 10414 3 yes 24 4 1 0 13 2 4 11 9 6 7
46 1 travel_rarely 669 sales 9 2 medical 3 male 64 2 3 sales executive 4 single 9619 13596 1 no 16 3 4 0 9 3 3 9 8 4 7
39 1 travel_frequently 203 research & development 2 3 life sciences 1 male 84 3 4 healthcare representative 4 divorced 12169 13547 7 no 11 3 4 3 21 4 3 18 7 11 5
38 1 travel_rarely 903 research & development 2 3 medical 3 male 81 3 2 manufacturing director 2 married 4855 7653 4 no 11 3 1 2 7 2 3 5 2 1 4
44 1 travel_rarely 621 research & development 15 3 medical 1 female 73 3 3 healthcare representative 4 married 7978 14075 1 no 11 3 4 1 10 2 3 10 7 0 5
34 1 travel_frequently 234 research & development 9 4 life sciences 4 male 93 3 2 laboratory technician 1 married 5346 6208 4 no 17 3 3 1 11 3 2 7 1 0 7
30 1 travel_rarely 740 sales 1 3 life sciences 2 male 64 2 2 sales executive 1 married 9714 5323 1 no 11 3 4 1 10 4 3 10 8 6 7
52 1 travel_rarely 723 research & development 8 4 medical 3 male 85 2 2 research scientist 2 married 4941 17747 2 no 15 3 1 0 11 3 2 8 2 7 7
48 1 travel_frequently 708 sales 7 2 medical 4 female 95 3 1 sales representative 3 married 2655 11740 2 yes 11 3 3 2 19 3 3 9 7 7 7
58 1 travel_rarely 147 research & development 23 4 medical 4 female 94 3 3 healthcare representative 4 married 10312 3465 1 no 12 3 4 1 40 3 2 40 10 15 6
31 1 travel_frequently 1445 research & development 1 5 life sciences 3 female 100 4 3 manufacturing director 2 single 7446 8931 1 no 11 3 1 0 10 2 3 10 8 4 7
52 1 travel_rarely 266 sales 2 1 marketing 1 female 57 1 5 manager 4 married 19845 25846 1 no 15 3 4 1 33 3 3 32 14 6 9
41 1 non-travel 906 research & development 5 2 life sciences 1 male 95 2 1 research scientist 1 divorced 2107 20293 6 no 17 3 1 1 5 2 1 1 0 0 0
41 1 travel_rarely 1360 research & development 12 3 technical degree 2 female 49 3 5 research director 3 married 19545 16280 1 no 12 3 4 0 23 0 3 22 15 15 8
46 1 travel_rarely 377 sales 9 3 marketing 1 male 52 3 3 sales executive 4 divorced 10096 15986 4 no 11 3 1 1 28 1 4 7 7 4 3
46 1 travel_rarely 1254 sales 10 3 life sciences 3 female 64 3 3 sales executive 2 married 7314 14011 5 no 21 4 3 3 14 2 3 8 7 0 7
49 1 travel_rarely 1184 sales 11 3 marketing 3 female 43 3 3 sales executive 4 married 7654 5860 1 no 18 3 1 2 9 3 4 9 8 7 7
39 1 travel_rarely 895 sales 5 3 technical degree 4 male 56 3 2 sales representative 4 married 2086 3335 3 no 14 3 3 1 19 6 4 1 0 0 0
40 1 non-travel 1479 sales 24 3 life sciences 2 female 100 4 4 sales executive 2 single 13194 17071 4 yes 16 3 4 0 22 2 2 1 0 0 0
33 1 travel_rarely 465 research & development 2 2 life sciences 1 female 39 3 1 laboratory technician 1 married 2707 21509 7 no 20 4 1 0 13 3 4 9 7 1 7
44 1 travel_frequently 920 research & development 24 3 life sciences 4 male 43 3 1 laboratory technician 3 divorced 3161 19920 3 yes 22 4 4 1 19 0 1 1 0 0 0
55 1 travel_rarely 725 research & development 2 3 medical 4 male 78 3 5 manager 1 married 19859 21199 5 yes 13 3 4 1 24 2 3 5 2 1 4
30 1 travel_rarely 945 sales 9 3 medical 2 male 89 3 1 sales representative 4 single 1081 16019 1 no 13 3 3 0 1 3 2 1 0 0 0
24 1 travel_rarely 984 research & development 17 2 life sciences 4 female 97 3 1 laboratory technician 2 married 2210 3372 1 no 13 3 1 1 1 3 1 1 0 0 0

Outliers Removed (Model m3)

Column

Model m3 w/o outliers

  • Does the model fit?
Estimate Std. Error z value Pr(>|z|)
(Intercept) 11.08086 2.88543 3.84027 0.00012
age -0.37035 0.10612 -3.48975 0.00048
I(age^2) 0.00440 0.00137 3.20619 0.00135
businesstraveltravel_rarely 2.47822 0.75830 3.26814 0.00108
businesstraveltravel_frequently 4.10756 0.82111 5.00244 0
dailyrate -0.00068 0.00035 -1.93914 0.05248
distancefromhome 0.06318 0.01701 3.71407 2e-04
education 0.00557 0.13620 0.04088 0.96739
educationfieldlife sciences -1.58314 0.95850 -1.65169 0.0986
educationfieldmarketing -0.35779 0.99758 -0.35866 0.71985
educationfieldmedical -1.87098 0.97454 -1.91986 0.05488
educationfieldother -3.10154 1.26781 -2.44638 0.01443
educationfieldtechnical degree 0.04804 1.00214 0.04794 0.96177
environmentsatisfaction -0.91278 0.14094 -6.47650 0
hourlyrate -0.00180 0.00685 -0.26285 0.79266
jobinvolvement -1.22807 0.20255 -6.06299 0
jobsatisfaction -0.43131 0.12783 -3.37418 0.00074
maritalstatusmarried -1.35908 0.40263 -3.37548 0.00074
maritalstatusdivorced -1.38510 0.54032 -2.56348 0.01036
monthlyrate 0.00000 0.00002 0.20644 0.83645
numcompaniesworked 0.30306 0.05912 5.12593 0
overtimeyes 3.13624 0.33275 9.42510 0
percentsalaryhike -0.04956 0.06186 -0.80120 0.42302
performancerating 0.49344 0.64629 0.76349 0.44517
stockoptionlevel -0.21974 0.23932 -0.91819 0.35852
totalworkingyears -0.22275 0.04092 -5.44291 0
trainingtimeslastyear -0.28519 0.11009 -2.59051 0.00958
yearsincurrentrole -0.24845 0.07696 -3.22814 0.00125
yearssincelastpromotion 0.23844 0.06847 3.48225 5e-04
yearswithcurrmanager 0.08868 0.07987 1.11028 0.26688
Residual.Deviance Residual.df
366.584 965
  • We see that model m3 without outliers in the data still fits by \(\frac{deviance_{res}}{df_{res}} \leq 1\). Check the marginal model plots.

Column

Marginal Model Plots (scrollable)

  • Check the marginal model plots (mmp) of the quantitative predictors to see if the model is specified correctly

Observations on model m3

  • After specifying the predictors correctly earlier, we saw that the fit of the model could potentially still improve. Therefore, we inspected the leverage and looked for outliers.

  • We identified 34 outliers in the training set, and, for the purpose of this analysis, removed those outliers under the assumption that there was sufficient reason to do so. Recall, this is a simulated data set.

  • After removing the outliers from the training set and re-running model m3 with the new data, we find that the overall linear fit of the model improved greatly.

  • There still appear to be several predictors that are not significant, though. Let’s run step-wise variable selection to see if we can further reduce the number of predictors in the model.
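The step-wise selection in the next section can be sketched with `stats::step` (both directions, AIC criterion); in the project the starting point is model m3 fit on the training data without outliers. Simulated data stand in here, with a pure-noise predictor that selection should tend to drop.

```r
# Sketch of both-direction step-wise selection by AIC; simulated data stand
# in for the training set, with `noise` as an irrelevant predictor.
set.seed(4)
d <- data.frame(x1 = rnorm(400), x2 = rnorm(400), noise = rnorm(400))
d$y <- rbinom(400, 1, plogis(-1 + d$x1 + 0.5 * d$x2))

m_full <- glm(y ~ x1 + x2 + noise, data = d, family = binomial)
m_step <- step(m_full, direction = "both", trace = FALSE)

formula(m_step)              # retained terms
AIC(m_step) <= AIC(m_full)   # step never worsens AIC
```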

Variable Selection (Model m3)

Column

Fwd/Bwd Stepwise Selection

Stepwise Model Path 
Analysis of Deviance Table

Initial Model:
attrition ~ age + I(age^2) + businesstravel + dailyrate + distancefromhome + 
    education + educationfield + environmentsatisfaction + hourlyrate + 
    jobinvolvement + jobsatisfaction + maritalstatus + monthlyrate + 
    numcompaniesworked + overtime + percentsalaryhike + performancerating + 
    stockoptionlevel + totalworkingyears + trainingtimeslastyear + 
    yearsincurrentrole + yearssincelastpromotion + yearswithcurrmanager

Final Model:
attrition ~ age + I(age^2) + businesstravel + dailyrate + distancefromhome + 
    educationfield + environmentsatisfaction + jobinvolvement + 
    jobsatisfaction + maritalstatus + numcompaniesworked + overtime + 
    totalworkingyears + trainingtimeslastyear + yearsincurrentrole + 
    yearssincelastpromotion


                    Step Df    Deviance Resid. Df Resid. Dev      AIC
1                                             965   366.5840 426.5840
2            - education  1 0.001671317       966   366.5857 424.5857
3          - monthlyrate  1 0.041329610       967   366.6270 422.6270
4           - hourlyrate  1 0.062737836       968   366.6898 420.6898
5    - performancerating  1 0.541018729       969   367.2308 419.2308
6    - percentsalaryhike  1 0.110578520       970   367.3414 417.3414
7     - stockoptionlevel  1 0.976428369       971   368.3178 416.3178
8 - yearswithcurrmanager  1 1.012913703       972   369.3307 415.3307

Column

Observations

  • Step-wise variable selection on model m3 removes seven variables.

  • None of the seven variables removed from the model was previously significant.

  • Interestingly, \(dailyrate\) was not removed from the model even though it was not statistically significant before.

Final Logistic Regr Model

Final Logistic Regression Model

\[\begin{align} logit[P(attrition = 1 (Yes))] = &\beta_0 + \beta_1age + \beta_{1a}age^2 + \beta_2businesstravel + \\ &\beta_3dailyrate + \beta_5distancefromhome + \\ &\beta_7educationfield + \beta_8environmentsatisfaction + \\ &\beta_{11}jobinvolvement + \beta_{14}jobsatisfaction + \\ &\beta_{15}maritalstatus + \beta_{18}numcompaniesworked + \\ &\beta_{19}overtime + \beta_{24}totalworkingyears + \beta_{25}trainingtimeslastyear + \\ &\beta_{28}yearsincurrentrole + \\ &\beta_{29}yearssincelastpromotion \end{align}\]

Estimate Std. Error z value Pr(>|z|)
(Intercept) 11.86865 2.34774 5.05535 0
age -0.36314 0.10282 -3.53188 0.00041
I(age^2) 0.00426 0.00133 3.20066 0.00137
businesstraveltravel_rarely 2.44736 0.75063 3.26041 0.00111
businesstraveltravel_frequently 4.06929 0.81015 5.02287 0
dailyrate -0.00064 0.00035 -1.86346 0.0624
distancefromhome 0.06069 0.01662 3.65113 0.00026
educationfieldlife sciences -1.76682 0.91870 -1.92317 0.05446
educationfieldmarketing -0.53888 0.95366 -0.56507 0.57203
educationfieldmedical -2.07068 0.92676 -2.23431 0.02546
educationfieldother -3.11794 1.23414 -2.52640 0.01152
educationfieldtechnical degree -0.13154 0.95904 -0.13716 0.89091
environmentsatisfaction -0.90788 0.13996 -6.48653 0
jobinvolvement -1.19419 0.19810 -6.02811 0
jobsatisfaction -0.42537 0.12569 -3.38423 0.00071
maritalstatusmarried -1.60791 0.32068 -5.01405 0
maritalstatusdivorced -1.68362 0.40700 -4.13666 4e-05
numcompaniesworked 0.29720 0.05879 5.05517 0
overtimeyes 3.13669 0.32874 9.54165 0
totalworkingyears -0.20686 0.03877 -5.33509 0
trainingtimeslastyear -0.29303 0.10945 -2.67735 0.00742
yearsincurrentrole -0.20462 0.06522 -3.13754 0.0017
yearssincelastpromotion 0.26748 0.06570 4.07117 5e-05
Residual.Deviance Residual.df
369.3307 972

SLR Model

Column

Notes

  • Before applying the Lasso method, we consider the following:
    • \(gender\), \(relationshipsatisfaction\) and \(worklifebalance\) are removed from the model because of their independence from the response variable (\(attrition\)). See Logistic Regression \(\rightarrow\) Check X vs. Y Independence.
    • The 34 outliers identified in section Logistic Regression \(\rightarrow\) Leverage were removed from the training data set beforehand to reduce their effect on the model.
    • \(department\), \(joblevel\), \(jobrole\), \(monthlyincome\), and \(yearsatcompany\) were removed because of high VIF/GVIF values. See Logistic Regression \(\rightarrow\) Check VIFs.
    • The quadratic term \(age^2\) is added to the model because we saw earlier that \(age\) is misspecified in the model. See Logistic Regression \(\rightarrow\) Reduced Model(m2) and Reduced Model(m3) sections.
    • For the remaining predictor terms, we will apply the Lasso method for variable selection.
    • The starting model for SLR (lasso) is the same as model m3. See Logistic Regression \(\rightarrow\) Reduced Model(m3).


Model before applying SLR (lasso)

\[\begin{align} logit[P(attrition = 1 (Yes))] = &\beta_0 + \beta_1age + \beta_{1a}age^2 + \beta_2businesstravel + \\ &\beta_3dailyrate + \beta_5distancefromhome + \beta_6education + \\ &\beta_7educationfield + \beta_8environmentsatisfaction + \beta_{10}hourlyrate + \\ &\beta_{11}jobinvolvement + \beta_{14}jobsatisfaction + \\ &\beta_{15}maritalstatus + \beta_{17}monthlyrate + \beta_{18}numcompaniesworked + \\ &\beta_{19}overtime + \beta_{20}percentsalaryhike + \beta_{21}performancerating + \\ &\beta_{23}stockoptionlevel + \beta_{24}totalworkingyears + \beta_{25}trainingtimeslastyear + \\ &\beta_{28}yearsincurrentrole + \\ &\beta_{29}yearssincelastpromotion + \beta_{30}yearswithcurrmanager \end{align}\]
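A lasso fit like the SLR model in the next section can be sketched with `glmnet` (an assumption about the package used): `cv.glmnet` with `family = "binomial"` and `alpha = 1` yields the `lambda.min` and `lambda.1se` coefficient columns reported there. A simulated model matrix and response stand in for the training data.

```r
# Sketch of sparse logistic regression (lasso) via glmnet; simulated x/y
# stand in for the model matrix built from the training data.
library(glmnet)

set.seed(5)
x <- matrix(rnorm(400 * 6), ncol = 6,
            dimnames = list(NULL, paste0("v", 1:6)))
y <- rbinom(400, 1, plogis(x[, 1] - x[, 2]))

cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1)  # alpha = 1 => lasso
coef(cvfit, s = "lambda.min")   # lower penalty: more nonzero coefficients
coef(cvfit, s = "lambda.1se")   # sparser model within one SE of the minimum
```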

Sparse Logistic Regression (SLR) Model Fit

Column

SLR Coefficients

coef_for_lambda_min coef_for_lambda_1se
(Intercept) 3.65983374026907 -0.421810470939189
age -0.24524238227586 -0.0191318358586904
age_sq 0.00283512351833354 0
businesstravel 1.56532271480022 0.817320668532357
dailyrate -0.000601615403144011 -0.0002201357094025
distancefromhome 0.0491562155026073 0.0190447607580739
education 0.00592522968043428 0
educationfield 0.129915310062553 0.0127507615619703
environmentsatisfaction -0.752816272518327 -0.408482786862819
hourlyrate -0.00238420670969245 0
jobinvolvement -1.06144472857087 -0.620078680375977
jobsatisfaction -0.416612406296805 -0.198859784912876
maritalstatus -0.602899629258942 -0.405845309268576
monthlyrate 3.79991366887273e-06 0
numcompaniesworked 0.270628341008726 0.108657243387995
overtime 2.73062139115611 1.83780606141179
percentsalaryhike -0.041483014849234 0
performancerating 0.358197380191492 0
stockoptionlevel -0.354735806848068 -0.1823727134757
totalworkingyears -0.208366172960455 -0.10611210161498
trainingtimeslastyear -0.247066913046405 -0.0700495155574719
yearsincurrentrole -0.191047848229074 -0.0619798594440491
yearssincelastpromotion 0.206531430074502 0.0053199843505578
yearswithcurrmanager 0.0439659692332794 0

Logistic Regression Comparisons

Column

Final Logistic Regr Coefficients by Step-wise Var Select

Estimate Std. Error z value Pr(>|z|)
(Intercept) 11.86865 2.34774 5.05535 0
age -0.36314 0.10282 -3.53188 0.00041
I(age^2) 0.00426 0.00133 3.20066 0.00137
businesstraveltravel_rarely 2.44736 0.75063 3.26041 0.00111
businesstraveltravel_frequently 4.06929 0.81015 5.02287 0
dailyrate -0.00064 0.00035 -1.86346 0.0624
distancefromhome 0.06069 0.01662 3.65113 0.00026
educationfieldlife sciences -1.76682 0.91870 -1.92317 0.05446
educationfieldmarketing -0.53888 0.95366 -0.56507 0.57203
educationfieldmedical -2.07068 0.92676 -2.23431 0.02546
educationfieldother -3.11794 1.23414 -2.52640 0.01152
educationfieldtechnical degree -0.13154 0.95904 -0.13716 0.89091
environmentsatisfaction -0.90788 0.13996 -6.48653 0
jobinvolvement -1.19419 0.19810 -6.02811 0
jobsatisfaction -0.42537 0.12569 -3.38423 0.00071
maritalstatusmarried -1.60791 0.32068 -5.01405 0
maritalstatusdivorced -1.68362 0.40700 -4.13666 4e-05
numcompaniesworked 0.29720 0.05879 5.05517 0
overtimeyes 3.13669 0.32874 9.54165 0
totalworkingyears -0.20686 0.03877 -5.33509 0
trainingtimeslastyear -0.29303 0.10945 -2.67735 0.00742
yearsincurrentrole -0.20462 0.06522 -3.13754 0.0017
yearssincelastpromotion 0.26748 0.06570 4.07117 5e-05

Column

Final Logistic Regr Coef by Lasso Var Select

coef_for_lambda_min coef_for_lambda_1se
(Intercept) 3.65983374026907 -0.421810470939189
age -0.24524238227586 -0.0191318358586904
age_sq 0.00283512351833354 0
businesstravel 1.56532271480022 0.817320668532357
dailyrate -0.000601615403144011 -0.0002201357094025
distancefromhome 0.0491562155026073 0.0190447607580739
education 0.00592522968043428 0
educationfield 0.129915310062553 0.0127507615619703
environmentsatisfaction -0.752816272518327 -0.408482786862819
hourlyrate -0.00238420670969245 0
jobinvolvement -1.06144472857087 -0.620078680375977
jobsatisfaction -0.416612406296805 -0.198859784912876
maritalstatus -0.602899629258942 -0.405845309268576
monthlyrate 3.79991366887273e-06 0
numcompaniesworked 0.270628341008726 0.108657243387995
overtime 2.73062139115611 1.83780606141179
percentsalaryhike -0.041483014849234 0
performancerating 0.358197380191492 0
stockoptionlevel -0.354735806848068 -0.1823727134757
totalworkingyears -0.208366172960455 -0.10611210161498
trainingtimeslastyear -0.247066913046405 -0.0700495155574719
yearsincurrentrole -0.191047848229074 -0.0619798594440491
yearssincelastpromotion 0.206531430074502 0.0053199843505578
yearswithcurrmanager 0.0439659692332794 0

RF Model

Column

Random Forest Model

  • We’ll use the whole training set for the Random Forest model, which should be less affected by collinearity.

  • We’ll use the training set to identify important variables to keep and then re-run a more sparse model before measuring performance.

  • We’ll also build two initial models: 1) on the training set w/ outliers, and 2) on the training set w/o outliers.

  • For the plots to the right:

    • Green - class error for class 1 (i.e., \(attrition\) = 1 (yes))
    • Red - class error for class 0 (i.e., \(attrition\) = 0 (no))
    • Black - out-of-bag (OOB) error
    • NOTE: we see lower error for class 0 because there are more “No” responses to learn from in the data
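The fits behind the plots can be sketched with the `randomForest` package (an assumption about the package used): `plot()` of the fit draws the error curves described in the bullets above, with the out-of-bag error in black and the per-class errors in color. Simulated data stand in for the training set.

```r
# Sketch of the random forest fit; simulated data stand in for the training
# set, with the response coded as a factor so classification trees are grown.
library(randomForest)

set.seed(6)
d <- data.frame(x1 = rnorm(400), x2 = rnorm(400))
d$attrition <- factor(rbinom(400, 1, plogis(-1.7 + d$x1)))

rf <- randomForest(attrition ~ ., data = d, ntree = 500, importance = TRUE)
plot(rf)          # OOB and per-class error vs. number of trees
rf$confusion      # OOB confusion matrix; the majority class errs less
```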

Column

RF Model Plot (w/ outliers)

RF Model Plot (w/o outliers)

Variable Importance (w/ Outliers)

Column

Variable Importance (w/ outliers) - Mean Decr in Accuracy

Variable Importance Plot (w/ outliers) - Mean Decr in Accuracy

Column

Variable Importance (w/ outliers) - Mean Decr in Node Impurity

Variable Importance Plot (w/ outliers) - Mean Decr in Node Impurity

Variable Importance (w/o Outliers)

Column

Variable Importance (w/o outliers) - Mean Decr in Accuracy

Variable Importance Plot (w/o outliers) - Mean Decr in Accuracy

Column

Variable Importance (w/o outliers) - Mean Decr in Node Impurity

Variable Importance Plot (w/o outliers) - Mean Decr in Node Impurity

Observations

Observations

  • To compare the bar plots, I chose to focus on predictor variables with a Mean Decrease in Accuracy \(\geq 5\) to identify important variables and find a potentially sparser model.

  • Based on the selected cut-off above, we will focus on the following as the most important, since they agree for both random forest models regardless of whether the training data contained outliers:

    • \(age\)
    • \(environmentsatisfaction\)
    • \(joblevel\)
    • \(jobrole\)
    • \(maritalstatus\)
    • \(monthlyincome\)
    • \(overtime\)
    • \(stockoptionlevel\)
    • \(totalworkingyears\)
    • \(yearsatcompany\)
  • Recall the variables noted as having correlations (collinearity) from the Initial Observations/Notes section. Although randomForest can handle collinear data, we see here that several correlated variables were given high importance. In particular, consider the correlated relationships among \(age\), \(joblevel\), \(monthlyincome\), \(totalworkingyears\), and \(yearsatcompany\).

  • Let’s rerun an RF model on data that does not have the following variables:

    • \(totalworkingyears\)
      • because a company may or may not know this info
      • we’re assuming that the total number of years a person has been working is irrelevant regarding attrition (i.e., you can quit at any time, you can get another offer at any time, you can be fired at any time…all of which don’t necessarily account for how long you’ve been in the workforce.)
    • \(yearsatcompany\)
      • this is more of an ‘umbrella’ measure that can overlap with other variables it’s correlated with (e.g., \(yearswithcurrmanager\))
      • correlated with \(monthlyincome\)
    • \(monthlyincome\)
      • it’s correlated with \(age\) and \(joblevel\)
      • it’s reasonable to expect that \(monthlyincome\) will be greater with a higher \(age\) and/or \(joblevel\)
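The rerun described above can be sketched as: drop the correlated predictors, refit, then read importance via `importance()`/`varImpPlot()` (Mean Decrease in Accuracy is `type = 1`, Mean Decrease in Node Impurity is `type = 2`). In this simulated stand-in, `x3` plays the role of the dropped correlated variables.

```r
# Sketch of refitting the random forest after dropping correlated predictors;
# x3 is deliberately collinear with x1, standing in for totalworkingyears,
# yearsatcompany, and monthlyincome in the project.
library(randomForest)

set.seed(7)
d <- data.frame(x1 = rnorm(400), x2 = rnorm(400))
d$x3 <- d$x1 + rnorm(400, sd = 0.1)
d$attrition <- factor(rbinom(400, 1, plogis(-1 + d$x1)))

d_sub <- d[, setdiff(names(d), "x3")]        # drop the correlated variable(s)
rf2 <- randomForest(attrition ~ ., data = d_sub, importance = TRUE)
importance(rf2, type = 1)                    # Mean Decrease in Accuracy
varImpPlot(rf2)
```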

Variable Importance 2 (w/ Outliers)

Column

Variable Importance (w/ outliers) - Mean Decr in Accuracy

Variable Importance Plot (w/ outliers) - Mean Decr in Accuracy

Column

Variable Importance (w/ outliers) - Mean Decr in Node Impurity

Variable Importance Plot (w/ outliers) - Mean Decr in Node Impurity

Variable Importance 2 (w/o Outliers)

Column

Variable Importance (w/o outliers) - Mean Decr in Accuracy

Variable Importance Plot (w/o outliers) - Mean Decr in Accuracy

Column

Variable Importance (w/o outliers) - Mean Decr in Node Impurity

Variable Importance Plot (w/o outliers) - Mean Decr in Node Impurity

Observations 2

Observations

  • Again, to compare the bar plots, I chose to focus on predictor variables with a Mean Decrease in Accuracy \(\geq 5\) to identify important variables and find a potentially sparser model.

  • Based on the selected cut-off, we will focus on the following variables as the most important by Mean Decrease in Accuracy; each appeared both in the RF model whose training data contained outliers and in the RF model whose training data did not:

    • \(age\)
    • \(educationfield\)
    • \(environmentsatisfaction\)
    • \(jobinvolvement\)
    • \(joblevel\)
    • \(jobrole\)
    • \(maritalstatus\)
    • \(numcompaniesworked\)
    • \(overtime\)
    • \(stockoptionlevel\)
    • \(yearsincurrentrole\)
  • Now, let’s develop a sparse random forest model that uses only the 11 most important variables we just identified.

RF Reduced Model

Column

RF Reduced Model Plot (w/ outliers)

RF Reduced Model Plot (w/o outliers)

Model Performance

Column

Performance - test set

AUC Correct Classification Rate Misclassification Rate
Logistic Regr 0.811543920517271 0.87075 0.12925
SLR (lambda.min) 0.816905850812178 0.85488 0.14512
SLR (lambda.1se) 0.797035167954585 0.86168 0.13832
RF (Saturated Model)* 0.822504336855386 0.86168 0.13832
RF (Saturated Model) 0.820690742785049 0.86621 0.13379
RF (Reduced Model)* 0.82069074278505 0.86395 0.13605
RF (Reduced Model) 0.823115439205173 0.86848 0.13152
* Training set contained outliers during model building.
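The test-set metrics in the table above can be sketched in base R: AUC via the rank (Wilcoxon) identity (ROC packages such as pROC give the same value), plus the correct-classification rate at a 0.5 cutoff. `probs` and `truth` are simulated stand-ins for a model's predicted probabilities and the observed attrition labels.

```r
# Sketch of AUC and (mis)classification rate on a held-out set; simulated
# probabilities and labels stand in for a fitted model's test-set output.
set.seed(8)
truth <- rbinom(400, 1, 0.16)
probs <- plogis(-1.7 + 2 * truth + rnorm(400))   # toy predictions

n1 <- sum(truth == 1); n0 <- sum(truth == 0)
auc <- (sum(rank(probs)[truth == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

pred <- as.integer(probs > 0.5)
ccr  <- mean(pred == truth)                      # correct classification rate
c(AUC = auc, CCR = ccr, Misclassification = 1 - ccr)
```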

Column

Receiver Operating Curves (ROCs)

Predictions

Predicted Probabilities & Class

test_data_index attrition Logistic Regr Prob Logistic Regr Class SLR (lambda_min) Prob SLR (lambda_min) Class SLR (lambda_1se) Prob SLR (lambda_1se) Class RF_full_w/_outliers Prob RF_full_w/_outliers Class RF_full_w/o_outliers Prob RF_full_w/o_outliers Class RF_reduced_w/_outliers Prob RF_reduced_w/_outliers Class RF_reduced_w/o_outliers Prob RF_reduced_w/o_outliers Class
1 1 0.1337551 0 0.4187221 0 0.4508243 0 0.357 0 0.339 0 0.357 0 0.355 0
2 0 0.0045126 0 0.0083202 0 0.0472341 0 0.066 0 0.074 0 0.093 0 0.077 0
3 0 0.0270882 0 0.0334978 0 0.0608848 0 0.272 0 0.224 0 0.296 0 0.208 0
4 0 0.0345040 0 0.0621377 0 0.1246250 0 0.062 0 0.042 0 0.057 0 0.041 0
5 0 0.0064818 0 0.0142927 0 0.0536165 0 0.410 0 0.347 0 0.423 0 0.338 0
6 0 0.0001915 0 0.0022330 0 0.0182444 0 0.091 0 0.065 0 0.092 0 0.053 0
7 0 0.1184722 0 0.1005978 0 0.1512744 0 0.294 0 0.210 0 0.301 0 0.211 0
8 0 0.0072444 0 0.0185718 0 0.0716059 0 0.113 0 0.082 0 0.117 0 0.069 0
9 0 0.0008074 0 0.0191949 0 0.0888132 0 0.080 0 0.036 0 0.067 0 0.044 0
10 1 0.3216325 0 0.4173543 0 0.3328851 0 0.353 0 0.319 0 0.389 0 0.349 0
11 0 0.8813008 1 0.8918318 1 0.7215041 1 0.319 0 0.334 0 0.327 0 0.350 0
12 0 0.0585817 0 0.1364132 0 0.1505066 0 0.187 0 0.127 0 0.190 0 0.131 0
13 0 0.1095135 0 0.2457727 0 0.2106443 0 0.114 0 0.138 0 0.139 0 0.126 0
14 0 0.2009834 0 0.2346135 0 0.3191659 0 0.035 0 0.021 0 0.035 0 0.026 0
15 0 0.1199910 0 0.3371876 0 0.2492579 0 0.091 0 0.067 0 0.112 0 0.056 0
16 0 0.0017555 0 0.0032567 0 0.0309785 0 0.031 0 0.024 0 0.040 0 0.024 0
17 0 0.0037252 0 0.0582621 0 0.1277394 0 0.110 0 0.077 0 0.113 0 0.080 0
18 0 0.0781689 0 0.1094613 0 0.1650881 0 0.166 0 0.110 0 0.189 0 0.122 0
19 0 0.0006637 0 0.0010312 0 0.0106097 0 0.208 0 0.192 0 0.236 0 0.191 0
20 0 0.2381318 0 0.0984095 0 0.0927205 0 0.353 0 0.290 0 0.365 0 0.278 0
21 0 0.0053633 0 0.0038909 0 0.0183735 0 0.095 0 0.045 0 0.097 0 0.046 0
22 0 0.0108929 0 0.0196967 0 0.0497142 0 0.146 0 0.102 0 0.180 0 0.091 0
23 0 0.0070470 0 0.0719895 0 0.1058973 0 0.172 0 0.157 0 0.163 0 0.137 0
24 0 0.0000030 0 0.0000095 0 0.0007603 0 0.179 0 0.070 0 0.174 0 0.075 0
25 0 0.0351221 0 0.0557567 0 0.1261415 0 0.102 0 0.105 0 0.138 0 0.096 0
26 0 0.0264458 0 0.0446655 0 0.1381170 0 0.185 0 0.173 0 0.180 0 0.163 0
27 0 0.0112697 0 0.0307188 0 0.0613374 0 0.110 0 0.109 0 0.114 0 0.085 0
28 0 0.0028556 0 0.0045635 0 0.0385197 0 0.144 0 0.092 0 0.107 0 0.103 0
29 0 0.0005014 0 0.0012573 0 0.0178767 0 0.079 0 0.018 0 0.063 0 0.016 0
30 0 0.0007068 0 0.0046479 0 0.0173095 0 0.108 0 0.030 0 0.093 0 0.031 0
31 0 0.0586538 0 0.0230449 0 0.0534774 0 0.143 0 0.071 0 0.148 0 0.072 0
32 0 0.0075223 0 0.0110632 0 0.0773356 0 0.125 0 0.051 0 0.123 0 0.092 0
33 1 0.1908698 0 0.1574158 0 0.2026731 0 0.164 0 0.132 0 0.142 0 0.148 0
34 0 0.4100984 0 0.4752593 0 0.2932294 0 0.226 0 0.197 0 0.231 0 0.219 0
35 0 0.0351379 0 0.0123723 0 0.0818762 0 0.230 0 0.202 0 0.228 0 0.214 0
36 1 0.1919437 0 0.3031714 0 0.2444309 0 0.366 0 0.232 0 0.413 0 0.243 0
37 0 0.0105224 0 0.0223195 0 0.0662741 0 0.178 0 0.144 0 0.157 0 0.137 0
38 0 0.0859349 0 0.0365784 0 0.0975079 0 0.221 0 0.194 0 0.228 0 0.198 0
39 0 0.0046028 0 0.0071235 0 0.0466816 0 0.159 0 0.094 0 0.137 0 0.098 0
40 0 0.0086627 0 0.0244027 0 0.0640974 0 0.099 0 0.040 0 0.095 0 0.027 0
41 0 0.0093444 0 0.0388963 0 0.0796283 0 0.050 0 0.030 0 0.042 0 0.019 0
42 0 0.0029099 0 0.0038009 0 0.0256727 0 0.118 0 0.031 0 0.105 0 0.044 0
43 0 0.0360528 0 0.1121357 0 0.2111306 0 0.309 0 0.327 0 0.303 0 0.329 0
44 0 0.0008743 0 0.0011603 0 0.0126282 0 0.059 0 0.018 0 0.067 0 0.009 0
45 0 0.0114729 0 0.0287557 0 0.0527610 0 0.171 0 0.126 0 0.177 0 0.126 0
46 0 0.0001516 0 0.0008493 0 0.0068252 0 0.115 0 0.048 0 0.121 0 0.054 0
47 0 0.0328435 0 0.0494563 0 0.1402950 0 0.195 0 0.104 0 0.223 0 0.131 0
48 0 0.0404809 0 0.0253462 0 0.0600043 0 0.182 0 0.189 0 0.196 0 0.184 0
49 0 0.0134792 0 0.0264300 0 0.0732834 0 0.144 0 0.086 0 0.125 0 0.077 0
50 0 0.0078715 0 0.0297649 0 0.0729885 0 0.075 0 0.044 0 0.087 0 0.046 0
51 0 0.0007378 0 0.0021073 0 0.0104622 0 0.113 0 0.025 0 0.076 0 0.022 0
52 0 0.0012921 0 0.0070192 0 0.0530667 0 0.022 0 0.013 0 0.031 0 0.008 0
53 0 0.0024030 0 0.0117787 0 0.0162441 0 0.117 0 0.057 0 0.131 0 0.051 0
54 0 0.0026386 0 0.0025510 0 0.0210910 0 0.125 0 0.048 0 0.130 0 0.057 0
55 1 0.3284143 0 0.4939918 0 0.4094639 0 0.228 0 0.219 0 0.216 0 0.201 0
56 1 0.0288190 0 0.0126501 0 0.0386043 0 0.153 0 0.103 0 0.158 0 0.101 0
57 1 0.0024981 0 0.0166987 0 0.0295823 0 0.174 0 0.138 0 0.170 0 0.148 0
58 0 0.0225045 0 0.0327220 0 0.0979431 0 0.064 0 0.067 0 0.065 0 0.071 0
59 1 0.8471049 1 0.6223316 1 0.4343002 0 0.310 0 0.307 0 0.317 0 0.323 0
60 0 0.0004397 0 0.0020089 0 0.0062553 0 0.134 0 0.057 0 0.125 0 0.082 0
61 0 0.0009878 0 0.0017079 0 0.0213123 0 0.070 0 0.041 0 0.078 0 0.032 0
62 0 0.0389992 0 0.1883261 0 0.1707194 0 0.197 0 0.083 0 0.187 0 0.108 0
63 0 0.0200781 0 0.0655895 0 0.1637667 0 0.119 0 0.071 0 0.083 0 0.052 0
64 0 0.0897300 0 0.2285908 0 0.0906802 0 0.269 0 0.204 0 0.267 0 0.178 0
65 0 0.0000024 0 0.0000114 0 0.0009173 0 0.077 0 0.013 0 0.063 0 0.019 0
66 0 0.0265708 0 0.0218304 0 0.0274346 0 0.275 0 0.261 0 0.288 0 0.247 0
67 0 0.0013741 0 0.0050014 0 0.0393164 0 0.160 0 0.075 0 0.141 0 0.051 0
68 1 0.3336486 0 0.3327704 0 0.4128249 0 0.313 0 0.334 0 0.354 0 0.348 0
69 0 0.0046679 0 0.0058214 0 0.0276992 0 0.117 0 0.060 0 0.087 0 0.050 0
70 0 0.0027885 0 0.0051027 0 0.0313399 0 0.168 0 0.108 0 0.172 0 0.094 0
71 0 0.0212682 0 0.0376949 0 0.0917933 0 0.039 0 0.008 0 0.043 0 0.012 0
72 1 0.0163098 0 0.0245831 0 0.0654238 0 0.152 0 0.103 0 0.164 0 0.095 0
73 0 0.0000321 0 0.0000938 0 0.0049418 0 0.044 0 0.009 0 0.031 0 0.010 0
74 0 0.0075065 0 0.0276098 0 0.0598509 0 0.259 0 0.150 0 0.255 0 0.151 0
75 0 0.0001282 0 0.0006631 0 0.0078769 0 0.090 0 0.051 0 0.093 0 0.037 0
76 0 0.1166397 0 0.0694775 0 0.1247120 0 0.201 0 0.148 0 0.201 0 0.157 0
77 0 0.0401757 0 0.0549363 0 0.1314545 0 0.246 0 0.207 0 0.210 0 0.201 0
78 0 0.0000267 0 0.0001233 0 0.0029547 0 0.059 0 0.054 0 0.066 0 0.059 0
79 1 0.1957583 0 0.3580698 0 0.1771279 0 0.114 0 0.104 0 0.126 0 0.100 0
80 0 0.0006776 0 0.0035939 0 0.0347680 0 0.064 0 0.028 0 0.059 0 0.028 0
81 0 0.0159722 0 0.0243066 0 0.1001524 0 0.094 0 0.073 0 0.090 0 0.077 0
82 0 0.9448432 1 0.9664750 1 0.7495044 1 0.211 0 0.237 0 0.217 0 0.243 0
83 0 0.0000451 0 0.0001158 0 0.0053106 0 0.026 0 0.027 0 0.044 0 0.030 0
84 1 0.5987490 1 0.6246445 1 0.4462352 0 0.368 0 0.388 0 0.366 0 0.363 0
85 0 0.0034845 0 0.0056455 0 0.0445092 0 0.079 0 0.054 0 0.061 0 0.046 0
86 0 0.4357993 0 0.2672650 0 0.0966984 0 0.218 0 0.185 0 0.214 0 0.181 0
87 0 0.0498006 0 0.0314534 0 0.0725392 0 0.333 0 0.223 0 0.318 0 0.248 0
88 0 0.0027430 0 0.0016405 0 0.0145365 0 0.075 0 0.011 0 0.075 0 0.004 0
89 0 0.0050480 0 0.0096339 0 0.0363607 0 0.080 0 0.062 0 0.088 0 0.057 0
90 0 0.0195947 0 0.0144163 0 0.0241114 0 0.195 0 0.087 0 0.178 0 0.076 0
91 0 0.0000012 0 0.0000029 0 0.0008156 0 0.039 0 0.004 0 0.032 0 0.005 0
92 0 0.0058197 0 0.0079097 0 0.0613316 0 0.069 0 0.061 0 0.086 0 0.041 0
93 0 0.0817179 0 0.0605762 0 0.1678354 0 0.188 0 0.198 0 0.206 0 0.209 0
94 0 0.0054617 0 0.0050526 0 0.0168789 0 0.121 0 0.045 0 0.085 0 0.050 0
95 0 0.0953937 0 0.0869177 0 0.1771020 0 0.274 0 0.279 0 0.271 0 0.315 0
96 0 0.0192653 0 0.0632697 0 0.0712673 0 0.171 0 0.176 0 0.189 0 0.128 0
97 0 0.2406020 0 0.0936175 0 0.1472987 0 0.178 0 0.159 0 0.158 0 0.169 0
98 0 0.4717975 0 0.4318131 0 0.4251381 0 0.380 0 0.423 0 0.427 0 0.438 0
99 0 0.1507289 0 0.0652438 0 0.1997076 0 0.141 0 0.136 0 0.152 0 0.127 0
100 0 0.0000592 0 0.0002153 0 0.0055337 0 0.065 0 0.008 0 0.063 0 0.010 0
101 0 0.0026949 0 0.0042893 0 0.0236133 0 0.073 0 0.050 0 0.074 0 0.052 0
102 0 0.0008899 0 0.0083315 0 0.0148236 0 0.096 0 0.025 0 0.070 0 0.038 0
103 0 0.0019166 0 0.0005646 0 0.0121800 0 0.085 0 0.053 0 0.087 0 0.047 0
104 0 0.3869147 0 0.4860831 0 0.4589995 0 0.106 0 0.094 0 0.099 0 0.096 0
105 0 0.0218355 0 0.0200540 0 0.0718735 0 0.127 0 0.125 0 0.140 0 0.113 0
106 0 0.0000892 0 0.0007260 0 0.0145025 0 0.020 0 0.008 0 0.023 0 0.012 0
107 0 0.0878294 0 0.0471688 0 0.1305889 0 0.353 0 0.343 0 0.364 0 0.339 0
108 0 0.0120081 0 0.0306890 0 0.0620194 0 0.212 0 0.155 0 0.205 0 0.153 0
109 0 0.0023978 0 0.0143372 0 0.0851694 0 0.143 0 0.142 0 0.162 0 0.166 0
110 0 0.0087125 0 0.0472126 0 0.0974707 0 0.175 0 0.041 0 0.171 0 0.051 0
111 0 0.0002587 0 0.0016454 0 0.0096041 0 0.117 0 0.031 0 0.115 0 0.039 0
112 0 0.0374439 0 0.0881227 0 0.1046851 0 0.289 0 0.266 0 0.301 0 0.259 0
113 1 0.6602630 1 0.4250006 0 0.4205235 0 0.539 1 0.533 1 0.558 1 0.536 1
114 0 0.0008149 0 0.0041854 0 0.0291088 0 0.098 0 0.045 0 0.072 0 0.035 0
115 0 0.0010930 0 0.0040304 0 0.0241975 0 0.129 0 0.063 0 0.150 0 0.074 0
116 0 0.0295114 0 0.0338515 0 0.0955538 0 0.136 0 0.091 0 0.132 0 0.087 0
117 0 0.0001283 0 0.0004283 0 0.0080276 0 0.119 0 0.009 0 0.113 0 0.020 0
118 0 0.0035600 0 0.0021569 0 0.0186939 0 0.040 0 0.030 0 0.037 0 0.018 0
119 0 0.0006537 0 0.0047012 0 0.0276844 0 0.081 0 0.060 0 0.096 0 0.063 0
120 0 0.0031794 0 0.0042375 0 0.0596358 0 0.057 0 0.046 0 0.084 0 0.043 0
121 0 0.0010254 0 0.0008358 0 0.0031248 0 0.324 0 0.157 0 0.298 0 0.124 0
122 0 0.0012098 0 0.0016650 0 0.0211428 0 0.038 0 0.005 0 0.039 0 0.004 0
123 0 0.0074586 0 0.0204373 0 0.0403816 0 0.080 0 0.052 0 0.082 0 0.034 0
124 0 0.0031283 0 0.0053851 0 0.0063269 0 0.229 0 0.202 0 0.246 0 0.176 0
125 0 0.0000204 0 0.0000878 0 0.0043743 0 0.043 0 0.010 0 0.052 0 0.014 0
126 0 0.0166280 0 0.0087690 0 0.0479625 0 0.212 0 0.051 0 0.157 0 0.056 0
127 0 0.0011968 0 0.0022213 0 0.0144465 0 0.018 0 0.008 0 0.008 0 0.009 0
128 0 0.0552652 0 0.0235869 0 0.1225138 0 0.077 0 0.057 0 0.100 0 0.051 0
129 0 0.0000739 0 0.0001878 0 0.0090469 0 0.055 0 0.044 0 0.060 0 0.032 0
130 0 0.0458771 0 0.0251006 0 0.0763054 0 0.375 0 0.363 0 0.375 0 0.336 0
131 1 0.1582966 0 0.2847517 0 0.2313425 0 0.285 0 0.259 0 0.298 0 0.259 0
132 1 0.9867730 1 0.7702262 1 0.5484662 1 0.294 0 0.250 0 0.291 0 0.266 0
133 0 0.0040325 0 0.0181613 0 0.0283898 0 0.176 0 0.046 0 0.142 0 0.048 0
134 0 0.0010971 0 0.0006741 0 0.0054194 0 0.133 0 0.119 0 0.135 0 0.113 0
135 0 0.0001922 0 0.0010149 0 0.0119226 0 0.087 0 0.057 0 0.078 0 0.054 0
136 0 0.0047337 0 0.0206817 0 0.0684658 0 0.102 0 0.082 0 0.093 0 0.063 0
137 0 0.1861249 0 0.1726628 0 0.3146860 0 0.277 0 0.300 0 0.282 0 0.269 0
138 0 0.0165069 0 0.0676318 0 0.1193857 0 0.115 0 0.079 0 0.101 0 0.076 0
139 0 0.0000558 0 0.0011592 0 0.0133743 0 0.048 0 0.008 0 0.051 0 0.009 0
140 0 0.0013361 0 0.0025209 0 0.0160384 0 0.121 0 0.062 0 0.100 0 0.045 0
141 0 0.0198842 0 0.0084451 0 0.0488225 0 0.072 0 0.041 0 0.067 0 0.046 0
142 0 0.0010625 0 0.0060920 0 0.0200540 0 0.141 0 0.103 0 0.143 0 0.088 0
143 0 0.0001299 0 0.0005179 0 0.0133263 0 0.044 0 0.012 0 0.052 0 0.013 0
144 0 0.0000461 0 0.0000681 0 0.0072878 0 0.049 0 0.034 0 0.048 0 0.028 0
145 0 0.0283510 0 0.2112498 0 0.1671991 0 0.226 0 0.148 0 0.240 0 0.104 0
146 1 0.0133459 0 0.0344811 0 0.0709359 0 0.180 0 0.124 0 0.164 0 0.117 0
147 0 0.0003722 0 0.0008136 0 0.0143468 0 0.026 0 0.007 0 0.030 0 0.004 0
148 0 0.0316823 0 0.0770000 0 0.1231009 0 0.080 0 0.046 0 0.087 0 0.045 0
149 0 0.0006107 0 0.0020953 0 0.0186313 0 0.096 0 0.049 0 0.088 0 0.056 0
150 0 0.0791667 0 0.2314215 0 0.1601991 0 0.106 0 0.081 0 0.125 0 0.106 0
151 0 0.1108319 0 0.1664897 0 0.2618657 0 0.280 0 0.336 0 0.290 0 0.339 0
152 0 0.0004814 0 0.0078145 0 0.0217114 0 0.025 0 0.004 0 0.023 0 0.006 0
153 0 0.0033356 0 0.0134315 0 0.0575561 0 0.083 0 0.042 0 0.087 0 0.046 0
154 0 0.0046335 0 0.0092666 0 0.0165996 0 0.183 0 0.042 0 0.180 0 0.045 0
155 0 0.0002570 0 0.0007382 0 0.0165888 0 0.015 0 0.005 0 0.009 0 0.012 0
156 0 0.0016421 0 0.0012287 0 0.0137771 0 0.101 0 0.033 0 0.101 0 0.027 0
157 0 0.9411722 1 0.6982641 1 0.2205437 0 0.371 0 0.350 0 0.371 0 0.340 0
158 0 0.0893883 0 0.0705186 0 0.0699067 0 0.164 0 0.128 0 0.172 0 0.128 0
159 0 0.0017157 0 0.0099601 0 0.0426781 0 0.106 0 0.042 0 0.099 0 0.041 0
160 0 0.1246630 0 0.1432676 0 0.0472070 0 0.272 0 0.217 0 0.260 0 0.243 0
161 0 0.1430649 0 0.0456246 0 0.0391877 0 0.138 0 0.133 0 0.146 0 0.122 0
162 0 0.0000496 0 0.0002375 0 0.0037002 0 0.150 0 0.054 0 0.150 0 0.050 0
163 0 0.1261464 0 0.2363198 0 0.3048984 0 0.318 0 0.312 0 0.294 0 0.289 0
164 0 0.0833318 0 0.1556333 0 0.1718231 0 0.127 0 0.115 0 0.121 0 0.120 0
165 0 0.0000230 0 0.0001138 0 0.0046981 0 0.114 0 0.041 0 0.102 0 0.040 0
166 0 0.0672990 0 0.0736348 0 0.1030247 0 0.087 0 0.056 0 0.096 0 0.054 0
167 0 0.1336384 0 0.3358414 0 0.2935904 0 0.457 0 0.407 0 0.460 0 0.441 0
168 1 0.0106911 0 0.0667543 0 0.1105915 0 0.227 0 0.200 0 0.252 0 0.231 0
169 0 0.7705408 1 0.6477037 1 0.4028946 0 0.364 0 0.364 0 0.363 0 0.334 0
170 0 0.6982219 1 0.7134085 1 0.4480307 0 0.214 0 0.214 0 0.243 0 0.197 0
171 0 0.0011596 0 0.0077892 0 0.0329859 0 0.034 0 0.005 0 0.027 0 0.006 0
172 0 0.0031621 0 0.0032252 0 0.0078969 0 0.048 0 0.018 0 0.032 0 0.015 0
173 1 0.2938411 0 0.2874528 0 0.2882168 0 0.387 0 0.382 0 0.392 0 0.385 0
174 1 0.8631972 1 0.8935911 1 0.7075276 1 0.386 0 0.356 0 0.406 0 0.382 0
175 0 0.0001488 0 0.0015320 0 0.0151451 0 0.103 0 0.041 0 0.101 0 0.038 0
176 0 0.0000693 0 0.0016627 0 0.0215273 0 0.046 0 0.009 0 0.050 0 0.012 0
177 1 0.0039267 0 0.0024247 0 0.0066741 0 0.190 0 0.170 0 0.204 0 0.175 0
178 0 0.0030091 0 0.0057057 0 0.0293645 0 0.081 0 0.069 0 0.095 0 0.068 0
179 0 0.0342259 0 0.0171627 0 0.0593716 0 0.139 0 0.133 0 0.124 0 0.122 0
180 0 0.5133942 1 0.6122211 1 0.4549639 0 0.209 0 0.219 0 0.240 0 0.215 0
181 0 0.1869409 0 0.1948410 0 0.2555306 0 0.124 0 0.096 0 0.118 0 0.092 0
182 1 0.0087300 0 0.0139821 0 0.0231813 0 0.169 0 0.056 0 0.159 0 0.054 0
183 0 0.1390102 0 0.0560900 0 0.1507287 0 0.148 0 0.138 0 0.144 0 0.168 0
184 0 0.5888683 1 0.4489420 0 0.2886850 0 0.371 0 0.348 0 0.369 0 0.366 0
185 0 0.0059118 0 0.0087592 0 0.0409931 0 0.083 0 0.055 0 0.097 0 0.058 0
186 0 0.0318950 0 0.0093579 0 0.0559848 0 0.288 0 0.241 0 0.296 0 0.235 0
187 0 0.0009859 0 0.0052140 0 0.0266601 0 0.039 0 0.017 0 0.044 0 0.019 0
188 0 0.0001229 0 0.0005341 0 0.0105824 0 0.101 0 0.050 0 0.098 0 0.044 0
189 0 0.0501286 0 0.0625177 0 0.0781054 0 0.239 0 0.218 0 0.217 0 0.219 0
190 0 0.0719036 0 0.4355450 0 0.3978982 0 0.284 0 0.250 0 0.282 0 0.264 0
191 1 0.5787015 1 0.4825893 0 0.2627013 0 0.370 0 0.311 0 0.336 0 0.322 0
192 0 0.1324848 0 0.0780338 0 0.1611493 0 0.208 0 0.174 0 0.203 0 0.149 0
193 0 0.0008665 0 0.0020288 0 0.0309646 0 0.080 0 0.080 0 0.090 0 0.097 0
194 0 0.0092122 0 0.0096625 0 0.0320395 0 0.114 0 0.043 0 0.130 0 0.045 0
195 0 0.0023512 0 0.0080001 0 0.0329625 0 0.032 0 0.024 0 0.028 0 0.027 0
196 0 0.1173947 0 0.3540086 0 0.3397612 0 0.410 0 0.398 0 0.403 0 0.404 0
197 0 0.0001335 0 0.0002342 0 0.0016432 0 0.142 0 0.024 0 0.135 0 0.025 0
198 1 0.9534352 1 0.9087436 1 0.7479968 1 0.642 1 0.625 1 0.639 1 0.625 1
199 0 0.0564550 0 0.0927846 0 0.1625383 0 0.170 0 0.142 0 0.156 0 0.128 0
200 0 0.1128428 0 0.1328838 0 0.1383095 0 0.117 0 0.069 0 0.100 0 0.081 0
201 0 0.0019940 0 0.0040490 0 0.0424676 0 0.093 0 0.051 0 0.103 0 0.061 0
202 0 0.1363895 0 0.1336864 0 0.2929541 0 0.437 0 0.410 0 0.423 0 0.407 0
203 1 0.2063555 0 0.1655506 0 0.3039219 0 0.209 0 0.205 0 0.216 0 0.201 0
204 1 0.1259600 0 0.4018362 0 0.3130601 0 0.364 0 0.338 0 0.374 0 0.346 0
205 0 0.0444273 0 0.0358925 0 0.0859927 0 0.061 0 0.047 0 0.082 0 0.047 0
206 0 0.0000296 0 0.0004971 0 0.0077484 0 0.126 0 0.031 0 0.132 0 0.029 0
207 0 0.0017017 0 0.0124547 0 0.0733215 0 0.085 0 0.053 0 0.088 0 0.047 0
208 1 0.9538270 1 0.7509404 1 0.5645636 1 0.520 1 0.515 1 0.535 1 0.516 1
209 0 0.1367509 0 0.0539115 0 0.1027467 0 0.278 0 0.223 0 0.276 0 0.202 0
210 1 0.9104614 1 0.9643886 1 0.7332428 1 0.719 1 0.698 1 0.720 1 0.702 1
211 0 0.0194976 0 0.0429687 0 0.0679086 0 0.184 0 0.127 0 0.161 0 0.126 0
212 0 0.0000150 0 0.0002935 0 0.0096730 0 0.046 0 0.016 0 0.042 0 0.020 0
213 0 0.0000154 0 0.0000426 0 0.0014203 0 0.012 0 0.001 0 0.017 0 0.000 0
214 0 0.0005014 0 0.0004911 0 0.0096962 0 0.104 0 0.051 0 0.100 0 0.046 0
215 0 0.0001854 0 0.0004696 0 0.0092347 0 0.047 0 0.013 0 0.037 0 0.012 0
216 0 0.0018896 0 0.0121530 0 0.0724095 0 0.077 0 0.052 0 0.080 0 0.061 0
217 0 0.0166153 0 0.0640539 0 0.1212705 0 0.152 0 0.128 0 0.142 0 0.108 0
218 0 0.0001553 0 0.0007154 0 0.0096425 0 0.044 0 0.006 0 0.047 0 0.011 0
219 0 0.0078368 0 0.0112054 0 0.0354891 0 0.070 0 0.066 0 0.070 0 0.050 0
220 1 0.1107747 0 0.3949672 0 0.3689064 0 0.311 0 0.272 0 0.310 0 0.291 0
221 0 0.0216813 0 0.0256983 0 0.1130430 0 0.114 0 0.101 0 0.114 0 0.108 0
222 1 0.2583389 0 0.3855865 0 0.3062375 0 0.255 0 0.226 0 0.267 0 0.256 0
223 0 0.0003670 0 0.0007890 0 0.0063034 0 0.045 0 0.003 0 0.041 0 0.008 0
224 0 0.0006355 0 0.0008377 0 0.0125333 0 0.078 0 0.077 0 0.087 0 0.061 0
225 0 0.0165701 0 0.0095447 0 0.0354588 0 0.176 0 0.148 0 0.175 0 0.132 0
226 0 0.0000010 0 0.0000166 0 0.0015692 0 0.106 0 0.050 0 0.086 0 0.043 0
227 1 0.0074582 0 0.0083523 0 0.0120029 0 0.132 0 0.090 0 0.145 0 0.074 0
228 0 0.0515070 0 0.0978347 0 0.1826754 0 0.091 0 0.086 0 0.105 0 0.088 0
229 0 0.0000074 0 0.0000225 0 0.0024132 0 0.044 0 0.017 0 0.043 0 0.005 0
230 0 0.1506474 0 0.0664623 0 0.0887669 0 0.138 0 0.112 0 0.136 0 0.125 0
231 0 0.0043754 0 0.0041100 0 0.0114024 0 0.168 0 0.099 0 0.161 0 0.069 0
232 1 0.0184221 0 0.0856748 0 0.1087341 0 0.160 0 0.114 0 0.155 0 0.122 0
233 0 0.1329671 0 0.1732015 0 0.3284085 0 0.251 0 0.260 0 0.299 0 0.248 0
234 0 0.0038942 0 0.0101933 0 0.0267540 0 0.127 0 0.056 0 0.112 0 0.057 0
235 0 0.0015977 0 0.0031964 0 0.0237950 0 0.135 0 0.105 0 0.126 0 0.096 0
236 1 0.9375414 1 0.7716365 1 0.5573393 1 0.672 1 0.681 1 0.659 1 0.674 1
237 1 0.4045536 0 0.3748456 0 0.2866753 0 0.427 0 0.328 0 0.450 0 0.358 0
238 1 0.0651663 0 0.0585678 0 0.0951399 0 0.171 0 0.162 0 0.140 0 0.157 0
239 0 0.1702881 0 0.2540788 0 0.2155688 0 0.193 0 0.185 0 0.189 0 0.179 0
240 0 0.0027520 0 0.0299722 0 0.0722285 0 0.067 0 0.028 0 0.054 0 0.028 0
241 0 0.0256653 0 0.0123587 0 0.0246886 0 0.152 0 0.063 0 0.156 0 0.043 0
242 0 0.1411291 0 0.1538589 0 0.2239611 0 0.094 0 0.085 0 0.092 0 0.080 0
243 1 0.9935442 1 0.9912663 1 0.8697647 1 0.592 1 0.566 1 0.572 1 0.569 1
244 1 0.1295196 0 0.4720120 0 0.4108341 0 0.266 0 0.197 0 0.267 0 0.185 0
245 0 0.0000055 0 0.0000319 0 0.0022875 0 0.067 0 0.050 0 0.058 0 0.044 0
246 0 0.0781145 0 0.0397458 0 0.0857240 0 0.239 0 0.200 0 0.239 0 0.191 0
247 0 0.0004727 0 0.0016672 0 0.0207261 0 0.100 0 0.060 0 0.101 0 0.068 0
248 0 0.0084655 0 0.0233624 0 0.0629486 0 0.148 0 0.110 0 0.154 0 0.116 0
249 0 0.0360742 0 0.0829544 0 0.1734112 0 0.237 0 0.164 0 0.222 0 0.159 0
250 0 0.0019687 0 0.0014588 0 0.0128687 0 0.166 0 0.049 0 0.137 0 0.065 0
251 0 0.0053327 0 0.0056912 0 0.0340091 0 0.070 0 0.085 0 0.071 0 0.073 0
252 0 0.7617952 1 0.7347657 1 0.5660679 1 0.405 0 0.412 0 0.415 0 0.431 0
253 1 0.0898013 0 0.2047139 0 0.1254809 0 0.353 0 0.359 0 0.362 0 0.345 0
254 0 0.1616641 0 0.2563610 0 0.3238161 0 0.309 0 0.308 0 0.318 0 0.348 0
255 1 0.0268019 0 0.1062677 0 0.1389169 0 0.060 0 0.051 0 0.065 0 0.057 0
256 0 0.0439199 0 0.0606521 0 0.0753423 0 0.059 0 0.026 0 0.035 0 0.038 0
257 0 0.0384008 0 0.0734482 0 0.0941656 0 0.082 0 0.101 0 0.078 0 0.101 0
258 0 0.0561820 0 0.0240420 0 0.0754966 0 0.272 0 0.265 0 0.242 0 0.245 0
259 1 0.0066349 0 0.0192975 0 0.0499218 0 0.107 0 0.069 0 0.103 0 0.066 0
260 1 0.4405608 0 0.4056042 0 0.1675301 0 0.330 0 0.313 0 0.357 0 0.298 0
261 0 0.0006148 0 0.0028705 0 0.0256852 0 0.112 0 0.100 0 0.121 0 0.099 0
262 0 0.1642043 0 0.2982706 0 0.3518394 0 0.327 0 0.198 0 0.279 0 0.205 0
263 0 0.0086549 0 0.1306591 0 0.1391573 0 0.103 0 0.091 0 0.091 0 0.081 0
264 0 0.7753201 1 0.6576590 1 0.3917544 0 0.368 0 0.322 0 0.332 0 0.353 0
265 0 0.0578950 0 0.0265818 0 0.0651920 0 0.154 0 0.081 0 0.161 0 0.074 0
266 0 0.0779777 0 0.1534118 0 0.1638349 0 0.168 0 0.100 0 0.141 0 0.100 0
267 0 0.0007552 0 0.0006586 0 0.0075086 0 0.102 0 0.016 0 0.093 0 0.016 0
268 0 0.0000527 0 0.0007938 0 0.0107259 0 0.143 0 0.032 0 0.135 0 0.032 0
269 0 0.0170257 0 0.0053237 0 0.0254372 0 0.166 0 0.124 0 0.155 0 0.135 0
270 0 0.0003457 0 0.0017817 0 0.0182937 0 0.066 0 0.047 0 0.080 0 0.067 0
271 0 0.0090884 0 0.0109081 0 0.0579455 0 0.112 0 0.083 0 0.129 0 0.090 0
272 0 0.0030082 0 0.0125604 0 0.0590461 0 0.172 0 0.180 0 0.202 0 0.195 0
273 0 0.0001095 0 0.0003213 0 0.0118005 0 0.030 0 0.022 0 0.037 0 0.019 0
274 0 0.0012203 0 0.0034233 0 0.0264937 0 0.121 0 0.092 0 0.126 0 0.091 0
275 0 0.0663109 0 0.0427113 0 0.1271579 0 0.171 0 0.125 0 0.140 0 0.138 0
276 0 0.0058593 0 0.0072033 0 0.0427159 0 0.261 0 0.218 0 0.281 0 0.205 0
277 0 0.1905733 0 0.1634716 0 0.2493260 0 0.401 0 0.395 0 0.405 0 0.406 0
278 0 0.0004218 0 0.0007031 0 0.0034163 0 0.182 0 0.099 0 0.189 0 0.091 0
279 1 0.9763644 1 0.9585561 1 0.6499532 1 0.405 0 0.393 0 0.400 0 0.423 0
280 0 0.0002421 0 0.0004622 0 0.0039789 0 0.148 0 0.078 0 0.160 0 0.077 0
281 0 0.0048949 0 0.0128794 0 0.0639464 0 0.140 0 0.056 0 0.133 0 0.062 0
282 0 0.0190399 0 0.0092756 0 0.0106796 0 0.178 0 0.059 0 0.175 0 0.066 0
283 0 0.6394043 1 0.7421519 1 0.5425470 1 0.448 0 0.435 0 0.422 0 0.457 0
284 0 0.0413549 0 0.1251049 0 0.2116007 0 0.156 0 0.149 0 0.164 0 0.156 0
285 0 0.2725283 0 0.1550718 0 0.0809102 0 0.160 0 0.061 0 0.171 0 0.046 0
286 0 0.1620246 0 0.0937113 0 0.0655731 0 0.172 0 0.120 0 0.129 0 0.096 0
287 1 0.0002529 0 0.0006681 0 0.0157892 0 0.014 0 0.005 0 0.011 0 0.006 0
288 0 0.0120464 0 0.0241450 0 0.0445099 0 0.182 0 0.102 0 0.176 0 0.084 0
289 1 0.7787311 1 0.4977424 0 0.4436588 0 0.476 0 0.464 0 0.438 0 0.434 0
290 0 0.0072036 0 0.0138191 0 0.0746379 0 0.092 0 0.051 0 0.099 0 0.060 0
291 0 0.0010917 0 0.0023725 0 0.0212708 0 0.184 0 0.124 0 0.179 0 0.139 0
292 0 0.0001563 0 0.0003600 0 0.0082463 0 0.135 0 0.032 0 0.134 0 0.034 0
293 0 0.0036373 0 0.0058607 0 0.0257565 0 0.059 0 0.028 0 0.064 0 0.036 0
294 0 0.0129333 0 0.0082475 0 0.0411094 0 0.052 0 0.018 0 0.057 0 0.016 0
295 0 0.0014167 0 0.0030087 0 0.0268358 0 0.031 0 0.009 0 0.032 0 0.008 0
296 0 0.0030184 0 0.0065308 0 0.0372915 0 0.332 0 0.285 0 0.326 0 0.267 0
297 1 0.2879939 0 0.0656793 0 0.0927326 0 0.469 0 0.503 1 0.498 0 0.504 1
298 0 0.0412993 0 0.0143946 0 0.0528896 0 0.205 0 0.173 0 0.209 0 0.173 0
299 1 0.6097509 1 0.5898293 1 0.4375606 0 0.403 0 0.315 0 0.392 0 0.297 0
300 0 0.0031829 0 0.0021985 0 0.0173159 0 0.103 0 0.044 0 0.078 0 0.034 0
301 0 0.0295631 0 0.0346101 0 0.0593214 0 0.122 0 0.109 0 0.152 0 0.093 0
302 1 0.7236389 1 0.6634866 1 0.3159985 0 0.435 0 0.467 0 0.439 0 0.454 0
303 0 0.0045371 0 0.0836413 0 0.0714527 0 0.252 0 0.195 0 0.249 0 0.186 0
304 0 0.0009521 0 0.0098941 0 0.0420243 0 0.255 0 0.167 0 0.243 0 0.174 0
305 1 0.3687107 0 0.2923503 0 0.2304483 0 0.268 0 0.217 0 0.258 0 0.214 0
306 0 0.0001028 0 0.0003747 0 0.0078390 0 0.132 0 0.041 0 0.134 0 0.032 0
307 0 0.3873544 0 0.1253572 0 0.3187728 0 0.439 0 0.413 0 0.414 0 0.412 0
308 1 0.8273117 1 0.8263663 1 0.6284362 1 0.454 0 0.427 0 0.472 0 0.438 0
309 1 0.6916826 1 0.6861170 1 0.4827902 0 0.316 0 0.259 0 0.275 0 0.265 0
310 0 0.0253008 0 0.0647111 0 0.1127145 0 0.229 0 0.152 0 0.232 0 0.147 0
311 0 0.0136116 0 0.0089816 0 0.0322462 0 0.107 0 0.097 0 0.129 0 0.092 0
312 0 0.0160134 0 0.0415754 0 0.1283516 0 0.136 0 0.135 0 0.153 0 0.141 0
313 1 0.3342967 0 0.4326685 0 0.4787350 0 0.599 1 0.566 1 0.586 1 0.614 1
314 0 0.0359444 0 0.0430882 0 0.1533923 0 0.073 0 0.058 0 0.064 0 0.058 0
315 0 0.0043276 0 0.0107379 0 0.0579866 0 0.089 0 0.056 0 0.102 0 0.058 0
316 0 0.0008598 0 0.0026969 0 0.0218097 0 0.038 0 0.009 0 0.040 0 0.012 0
317 0 0.0071281 0 0.0133885 0 0.0540520 0 0.136 0 0.056 0 0.146 0 0.064 0
318 0 0.0385903 0 0.0461769 0 0.0754486 0 0.144 0 0.071 0 0.141 0 0.068 0
319 0 0.0276305 0 0.0584266 0 0.0950141 0 0.113 0 0.087 0 0.120 0 0.084 0
320 0 0.0000075 0 0.0000128 0 0.0014759 0 0.036 0 0.001 0 0.034 0 0.005 0
321 1 0.3870047 0 0.2451435 0 0.2151350 0 0.275 0 0.207 0 0.272 0 0.183 0
322 0 0.0029417 0 0.0095101 0 0.0522592 0 0.156 0 0.060 0 0.137 0 0.067 0
323 1 0.7822285 1 0.8555540 1 0.5147541 1 0.435 0 0.464 0 0.460 0 0.436 0
324 0 0.0089589 0 0.0124964 0 0.0616423 0 0.170 0 0.123 0 0.166 0 0.123 0
325 0 0.0170849 0 0.0179949 0 0.1342629 0 0.243 0 0.183 0 0.210 0 0.211 0
326 0 0.0055259 0 0.0060352 0 0.0501381 0 0.070 0 0.032 0 0.046 0 0.049 0
327 1 0.0017862 0 0.0027621 0 0.0297746 0 0.231 0 0.163 0 0.226 0 0.133 0
328 0 0.0143002 0 0.0621361 0 0.1205186 0 0.038 0 0.025 0 0.051 0 0.034 0
329 0 0.0024926 0 0.0127504 0 0.0494858 0 0.132 0 0.073 0 0.122 0 0.077 0
330 0 0.0083335 0 0.1106269 0 0.1549034 0 0.087 0 0.083 0 0.092 0 0.090 0
331 0 0.0059159 0 0.0056931 0 0.0275003 0 0.098 0 0.068 0 0.092 0 0.062 0
332 0 0.0102076 0 0.0477751 0 0.0702498 0 0.192 0 0.156 0 0.200 0 0.128 0
333 0 0.1119398 0 0.1250779 0 0.2642530 0 0.430 0 0.414 0 0.450 0 0.440 0
334 0 0.0114611 0 0.0143167 0 0.0503824 0 0.220 0 0.120 0 0.215 0 0.108 0
335 0 0.0487797 0 0.0303058 0 0.0830845 0 0.146 0 0.096 0 0.163 0 0.110 0
336 0 0.0630771 0 0.0497466 0 0.0682846 0 0.108 0 0.104 0 0.132 0 0.127 0
337 0 0.0000992 0 0.0007246 0 0.0063580 0 0.144 0 0.012 0 0.161 0 0.022 0
338 0 0.0650918 0 0.0533453 0 0.1113534 0 0.076 0 0.064 0 0.105 0 0.049 0
339 0 0.0040542 0 0.0112457 0 0.0502029 0 0.047 0 0.040 0 0.061 0 0.035 0
340 0 0.0056036 0 0.0131826 0 0.0862792 0 0.056 0 0.016 0 0.046 0 0.022 0
341 0 0.3024282 0 0.4394254 0 0.4772447 0 0.253 0 0.237 0 0.286 0 0.225 0
342 0 0.4537504 0 0.4552905 0 0.3654867 0 0.112 0 0.097 0 0.097 0 0.104 0
343 0 0.0062782 0 0.0121275 0 0.0309235 0 0.106 0 0.006 0 0.107 0 0.013 0
344 0 0.0000827 0 0.0005784 0 0.0122198 0 0.115 0 0.059 0 0.137 0 0.073 0
345 0 0.1474822 0 0.2198360 0 0.2216298 0 0.170 0 0.147 0 0.163 0 0.161 0
346 1 0.9865033 1 0.9814769 1 0.8084308 1 0.658 1 0.653 1 0.676 1 0.657 1
347 0 0.0002680 0 0.0010143 0 0.0187046 0 0.044 0 0.009 0 0.043 0 0.005 0
348 0 0.0129712 0 0.0135429 0 0.0215331 0 0.132 0 0.035 0 0.122 0 0.026 0
349 1 0.0048632 0 0.0193661 0 0.0492586 0 0.221 0 0.100 0 0.219 0 0.115 0
350 0 0.0116420 0 0.0463087 0 0.0521836 0 0.172 0 0.116 0 0.156 0 0.093 0
351 0 0.9875202 1 0.9379875 1 0.7145609 1 0.553 1 0.591 1 0.539 1 0.562 1
352 0 0.3794611 0 0.5469061 1 0.2761320 0 0.243 0 0.186 0 0.239 0 0.193 0
353 0 0.0002416 0 0.0003731 0 0.0110078 0 0.070 0 0.052 0 0.074 0 0.040 0
354 0 0.0016854 0 0.0124215 0 0.0479366 0 0.051 0 0.015 0 0.038 0 0.019 0
355 0 0.0003788 0 0.0004168 0 0.0071553 0 0.058 0 0.017 0 0.055 0 0.019 0
356 0 0.0005569 0 0.0012339 0 0.0088800 0 0.013 0 0.005 0 0.007 0 0.003 0
357 0 0.0003318 0 0.0007113 0 0.0195470 0 0.055 0 0.040 0 0.045 0 0.033 0
358 0 0.0000796 0 0.0002305 0 0.0085847 0 0.042 0 0.010 0 0.042 0 0.017 0
359 0 0.4062016 0 0.5313807 1 0.4140644 0 0.212 0 0.192 0 0.211 0 0.177 0
360 0 0.1368100 0 0.3148454 0 0.2357519 0 0.164 0 0.176 0 0.163 0 0.158 0
361 0 0.0001769 0 0.0002767 0 0.0091177 0 0.035 0 0.020 0 0.030 0 0.014 0
362 0 0.0189582 0 0.0160666 0 0.0964778 0 0.420 0 0.350 0 0.414 0 0.340 0
363 0 0.1218905 0 0.1411406 0 0.1380381 0 0.391 0 0.345 0 0.403 0 0.358 0
364 0 0.0034809 0 0.0080162 0 0.0625897 0 0.135 0 0.101 0 0.105 0 0.102 0
365 0 0.0503732 0 0.0231125 0 0.0879836 0 0.335 0 0.291 0 0.325 0 0.326 0
366 1 0.4379206 0 0.4971121 0 0.4206947 0 0.442 0 0.463 0 0.467 0 0.485 0
367 0 0.0008674 0 0.0040601 0 0.0357247 0 0.185 0 0.120 0 0.180 0 0.095 0
368 1 0.0187812 0 0.0317394 0 0.0988180 0 0.237 0 0.162 0 0.224 0 0.148 0
369 0 0.0049546 0 0.0070580 0 0.0183464 0 0.127 0 0.092 0 0.108 0 0.080 0
370 0 0.0049716 0 0.0087068 0 0.0485143 0 0.046 0 0.022 0 0.053 0 0.034 0
371 0 0.8972721 1 0.9127597 1 0.7727327 1 0.386 0 0.364 0 0.380 0 0.354 0
372 0 0.0543964 0 0.0795477 0 0.1288612 0 0.143 0 0.159 0 0.166 0 0.150 0
373 0 0.0387480 0 0.0480707 0 0.1746278 0 0.109 0 0.089 0 0.110 0 0.093 0
374 0 0.0016205 0 0.0032786 0 0.0262977 0 0.040 0 0.008 0 0.032 0 0.018 0
375 0 0.0000738 0 0.0002568 0 0.0090467 0 0.023 0 0.013 0 0.020 0 0.006 0
376 0 0.0013806 0 0.0021297 0 0.0408112 0 0.103 0 0.039 0 0.107 0 0.044 0
377 1 0.5809632 1 0.1298864 0 0.2085647 0 0.160 0 0.175 0 0.209 0 0.176 0
378 0 0.0985561 0 0.1689119 0 0.1512088 0 0.276 0 0.218 0 0.261 0 0.239 0
379 0 0.1657761 0 0.0919988 0 0.1078083 0 0.181 0 0.177 0 0.193 0 0.179 0
380 0 0.0064923 0 0.0099261 0 0.0196963 0 0.058 0 0.029 0 0.063 0 0.041 0
381 0 0.0434580 0 0.0696445 0 0.1164609 0 0.156 0 0.076 0 0.162 0 0.091 0
382 0 0.0753815 0 0.0455306 0 0.1198244 0 0.320 0 0.311 0 0.300 0 0.309 0
383 0 0.0054673 0 0.0150402 0 0.0584995 0 0.108 0 0.077 0 0.114 0 0.069 0
384 0 0.0036745 0 0.0075175 0 0.0561052 0 0.314 0 0.282 0 0.295 0 0.309 0
385 1 0.8804426 1 0.8095081 1 0.6404735 1 0.550 1 0.525 1 0.533 1 0.551 1
386 0 0.0251341 0 0.1173158 0 0.1651129 0 0.113 0 0.074 0 0.142 0 0.067 0
387 0 0.0027132 0 0.0073188 0 0.0294084 0 0.067 0 0.043 0 0.046 0 0.032 0
388 1 0.9817922 1 0.9546164 1 0.7522236 1 0.434 0 0.423 0 0.419 0 0.415 0
389 0 0.0048077 0 0.0069992 0 0.0365350 0 0.056 0 0.024 0 0.049 0 0.022 0
390 1 0.2893101 0 0.5313151 1 0.4680589 0 0.492 0 0.499 0 0.514 1 0.537 1
391 0 0.0470439 0 0.0843173 0 0.1249616 0 0.147 0 0.102 0 0.149 0 0.107 0
392 1 0.0020593 0 0.0084588 0 0.0467655 0 0.068 0 0.041 0 0.053 0 0.050 0
393 1 0.0922388 0 0.3096804 0 0.2715704 0 0.391 0 0.377 0 0.396 0 0.365 0
394 1 0.0012029 0 0.0062440 0 0.0213585 0 0.071 0 0.010 0 0.049 0 0.016 0
395 0 0.0014272 0 0.0011913 0 0.0121661 0 0.062 0 0.027 0 0.071 0 0.032 0
396 0 0.0128770 0 0.0280927 0 0.0757918 0 0.089 0 0.059 0 0.096 0 0.070 0
397 1 0.0177932 0 0.0102105 0 0.0529933 0 0.414 0 0.378 0 0.386 0 0.365 0
398 1 0.9948978 1 0.8587088 1 0.6437008 1 0.473 0 0.518 1 0.489 0 0.511 1
399 0 0.1107487 0 0.1112286 0 0.1258211 0 0.110 0 0.097 0 0.131 0 0.089 0
400 0 0.0006625 0 0.0021320 0 0.0168967 0 0.089 0 0.094 0 0.091 0 0.093 0
401 0 0.1979463 0 0.1231776 0 0.1169248 0 0.192 0 0.212 0 0.197 0 0.184 0
402 0 0.0000176 0 0.0000155 0 0.0018284 0 0.130 0 0.055 0 0.120 0 0.043 0
403 0 0.0190374 0 0.0608910 0 0.0793835 0 0.120 0 0.081 0 0.125 0 0.069 0
404 0 0.0331881 0 0.0190314 0 0.0346026 0 0.208 0 0.166 0 0.200 0 0.127 0
405 0 0.1149406 0 0.0826578 0 0.2235723 0 0.158 0 0.105 0 0.166 0 0.116 0
406 0 0.0025976 0 0.0034800 0 0.0302775 0 0.087 0 0.047 0 0.085 0 0.047 0
407 0 0.0675733 0 0.5676680 1 0.4420326 0 0.167 0 0.176 0 0.162 0 0.189 0
408 0 0.0066697 0 0.0104739 0 0.0414407 0 0.060 0 0.019 0 0.069 0 0.017 0
409 0 0.3234854 0 0.0949732 0 0.1900134 0 0.202 0 0.154 0 0.201 0 0.161 0
410 0 0.0016931 0 0.0048759 0 0.0423037 0 0.233 0 0.235 0 0.247 0 0.232 0
411 1 0.5035777 1 0.4717216 0 0.3709962 0 0.264 0 0.244 0 0.247 0 0.256 0
412 0 0.1251573 0 0.2013830 0 0.1117740 0 0.200 0 0.190 0 0.226 0 0.181 0
413 0 0.0337413 0 0.3038647 0 0.2528104 0 0.248 0 0.217 0 0.266 0 0.217 0
414 0 0.0003892 0 0.0007819 0 0.0100029 0 0.158 0 0.046 0 0.154 0 0.056 0
415 0 0.0804113 0 0.0486219 0 0.0711035 0 0.127 0 0.063 0 0.138 0 0.066 0
416 0 0.0602379 0 0.1583113 0 0.0952658 0 0.093 0 0.065 0 0.091 0 0.062 0
417 0 0.0002368 0 0.0022071 0 0.0283572 0 0.132 0 0.120 0 0.125 0 0.112 0
418 0 0.0169605 0 0.0355325 0 0.0591055 0 0.244 0 0.211 0 0.266 0 0.209 0
419 0 0.0125479 0 0.0185922 0 0.0436857 0 0.096 0 0.052 0 0.105 0 0.046 0
420 0 0.0120720 0 0.1473026 0 0.2416239 0 0.295 0 0.250 0 0.287 0 0.244 0
421 0 0.0001443 0 0.0019544 0 0.0115942 0 0.039 0 0.029 0 0.041 0 0.028 0
422 0 0.1673270 0 0.0757798 0 0.1046299 0 0.346 0 0.381 0 0.372 0 0.375 0
423 0 0.0020409 0 0.0059440 0 0.0378728 0 0.119 0 0.092 0 0.116 0 0.079 0
424 0 0.0000156 0 0.0000450 0 0.0034086 0 0.046 0 0.004 0 0.042 0 0.003 0
425 0 0.4688507 0 0.5841083 1 0.2436172 0 0.135 0 0.030 0 0.146 0 0.028 0
426 0 0.0994804 0 0.2985674 0 0.2636824 0 0.208 0 0.181 0 0.205 0 0.193 0
427 0 0.4394713 0 0.3902367 0 0.2400195 0 0.257 0 0.234 0 0.282 0 0.222 0
428 0 0.0015534 0 0.0032903 0 0.0289365 0 0.071 0 0.032 0 0.062 0 0.034 0
429 0 0.7044950 1 0.7721093 1 0.5472882 1 0.625 1 0.615 1 0.608 1 0.604 1
430 0 0.0000312 0 0.0001290 0 0.0052183 0 0.081 0 0.045 0 0.083 0 0.049 0
431 0 0.0019083 0 0.0045630 0 0.0385711 0 0.086 0 0.034 0 0.077 0 0.046 0
432 0 0.0120290 0 0.0079796 0 0.0415151 0 0.066 0 0.026 0 0.074 0 0.024 0
433 0 0.2988495 0 0.2925479 0 0.3141300 0 0.274 0 0.223 0 0.257 0 0.251 0
434 1 0.0165202 0 0.0146160 0 0.0473238 0 0.096 0 0.043 0 0.076 0 0.034 0
435 0 0.0308433 0 0.0416157 0 0.0863887 0 0.118 0 0.073 0 0.108 0 0.081 0
436 0 0.0219957 0 0.0287299 0 0.0915299 0 0.081 0 0.062 0 0.080 0 0.069 0
437 0 0.1398414 0 0.1598856 0 0.2819705 0 0.109 0 0.102 0 0.090 0 0.108 0
438 0 0.0000436 0 0.0001129 0 0.0057065 0 0.058 0 0.029 0 0.047 0 0.031 0
439 0 0.0018365 0 0.0086006 0 0.0469397 0 0.079 0 0.062 0 0.081 0 0.071 0
440 0 0.0006884 0 0.0030133 0 0.0235452 0 0.038 0 0.012 0 0.034 0 0.014 0
441 0 0.0004364 0 0.0044121 0 0.0305393 0 0.104 0 0.032 0 0.109 0 0.046 0

Differences

Column

Number misclassified

# of misclassified instances - Logistic Regr:  57
# of misclassified instances - SLR(lambda_min):  64
# of misclassified instances - SLR(lambda_1se):  61
# of misclassified instances - RF_full_w/_outliers:  61
# of misclassified instances - RF_full_w/o_outliers:  59
# of misclassified instances - RF_reduced_w/_outliers:  60
# of misclassified instances - RF_reduced_w/o_outliers:  58
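Counts like these can be reproduced from each model's predicted probabilities with a small helper (a sketch; `probs` and `actual` are hypothetical stand-ins for a model's test-set output):

```r
# Count test-set instances whose thresholded prediction disagrees with the label
n_misclassified <- function(probs, actual, threshold = 0.5) {
  predicted <- as.integer(probs >= threshold)
  sum(predicted != actual)
}

# Toy check: predictions 1,0,1,0 vs. labels 1,0,0,1 disagree twice
n_misclassified(c(0.9, 0.2, 0.7, 0.4), c(1, 0, 0, 1))  # 2
```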

Misclass Agreements (continuous)

Misclass Agreements (discrete)

Misclassified (All models agree)

  • This is a table of the test data instances that were misclassified. These specific instances were misclassified across all models applied.
# of the SAME misclassified instances that occur across ALL models:  43


index age attrition businesstravel dailyrate department distancefromhome education educationfield environmentsatisfaction gender hourlyrate jobinvolvement joblevel jobrole jobsatisfaction maritalstatus monthlyincome monthlyrate numcompaniesworked overtime percentsalaryhike performancerating relationshipsatisfaction stockoptionlevel totalworkingyears trainingtimeslastyear worklifebalance yearsatcompany yearsincurrentrole yearssincelastpromotion yearswithcurrmanager
1 37 1 travel_rarely 1373 research & development 2 2 other 4 male 92 2 1 laboratory technician 3 single 2090 2396 6 yes 15 3 2 0 7 3 3 0 0 0 0
10 24 1 travel_rarely 813 research & development 1 3 medical 2 male 61 3 1 research scientist 4 married 2293 3020 2 yes 16 3 1 1 6 2 2 2 0 2 0
33 51 1 travel_frequently 1150 research & development 8 4 life sciences 1 male 53 1 3 manufacturing director 4 single 10650 25150 2 no 15 3 4 0 18 2 3 4 2 0 3
36 32 1 travel_rarely 1033 research & development 9 3 medical 1 female 41 3 1 laboratory technician 1 single 4200 10224 7 no 22 4 1 0 10 2 4 5 4 0 4
55 38 1 travel_rarely 1180 research & development 29 1 medical 2 male 70 3 2 healthcare representative 1 married 6673 11354 7 yes 19 3 2 0 17 2 3 1 0 0 0
56 29 1 travel_rarely 121 sales 27 3 marketing 2 female 35 3 3 sales executive 4 married 7639 24525 1 no 22 4 4 3 10 3 2 10 4 1 9
57 32 1 travel_rarely 1045 sales 4 4 medical 4 male 32 1 3 sales executive 4 married 10400 25812 1 no 11 3 3 0 14 2 2 14 8 9 8
68 32 1 travel_rarely 515 research & development 1 3 life sciences 4 male 62 2 1 laboratory technician 3 single 3730 9571 0 yes 14 3 4 0 4 2 1 3 2 1 2
72 37 1 travel_frequently 504 research & development 10 3 medical 1 male 61 3 3 manufacturing director 3 divorced 10048 22573 6 no 11 3 2 2 17 5 3 1 0 0 0
79 47 1 non-travel 666 research & development 29 4 life sciences 1 male 88 3 3 manager 2 married 11849 10268 1 yes 12 3 4 1 10 2 2 10 7 9 9
131 31 1 travel_frequently 534 research & development 20 3 life sciences 1 male 66 3 3 healthcare representative 3 married 9824 22908 3 no 12 3 1 0 12 2 3 1 0 0 0
146 31 1 travel_rarely 1365 sales 13 4 medical 2 male 46 3 2 sales executive 1 divorced 4233 11512 2 no 17 3 3 0 9 2 1 3 1 1 2
168 33 1 travel_rarely 527 research & development 1 4 other 4 male 63 3 1 research scientist 4 single 2686 5207 1 yes 13 3 3 0 10 2 2 10 9 7 8
173 23 1 travel_rarely 1243 research & development 6 3 life sciences 3 male 63 4 1 laboratory technician 1 married 1601 3445 1 yes 21 4 3 2 1 2 3 0 0 0 0
177 58 1 travel_rarely 286 research & development 2 4 life sciences 4 male 31 3 5 research director 2 single 19246 25761 7 yes 12 3 4 0 40 2 3 31 15 13 8
182 55 1 travel_rarely 436 sales 2 1 medical 3 male 37 3 2 sales executive 4 single 5160 21519 4 no 16 3 3 0 12 3 2 9 7 7 3
203 41 1 travel_rarely 1085 research & development 2 4 life sciences 2 female 57 1 1 laboratory technician 4 divorced 2778 17725 4 yes 13 3 3 1 10 1 2 7 7 1 0
204 39 1 travel_rarely 1122 research & development 6 3 medical 4 male 70 3 1 laboratory technician 1 married 2404 4303 7 yes 21 4 4 0 8 2 1 2 2 2 2
220 35 1 travel_rarely 622 research & development 14 4 other 3 male 39 2 1 laboratory technician 2 divorced 3743 10074 1 yes 24 4 4 1 5 2 1 4 2 0 2
222 30 1 travel_frequently 109 research & development 5 3 medical 2 female 60 3 1 laboratory technician 2 single 2422 25725 0 no 17 3 1 0 4 3 3 3 2 1 2
227 36 1 travel_rarely 885 research & development 16 4 life sciences 3 female 43 4 1 laboratory technician 1 single 2743 8269 1 no 16 3 3 0 18 1 3 17 13 15 14
232 36 1 travel_rarely 660 research & development 15 3 other 1 male 81 3 2 laboratory technician 3 divorced 4834 7858 7 no 14 3 2 1 9 3 2 1 0 0 0
237 21 1 travel_rarely 1334 research & development 10 3 life sciences 3 female 36 2 1 laboratory technician 1 single 1416 17258 1 no 13 3 1 0 1 6 2 1 0 1 0
238 28 1 non-travel 1366 research & development 24 2 technical degree 2 male 72 2 3 healthcare representative 1 single 8722 12355 1 no 12 3 1 0 10 2 2 10 7 1 9
244 50 1 travel_frequently 959 sales 1 4 other 4 male 81 3 2 sales executive 3 single 4728 17251 3 yes 14 3 4 0 5 4 3 0 0 0 0
253 18 1 non-travel 247 research & development 8 1 medical 3 male 80 3 1 laboratory technician 3 single 1904 13556 1 no 12 3 4 0 0 0 3 0 0 0 0
255 31 1 travel_frequently 874 research & development 15 3 medical 3 male 72 3 1 laboratory technician 3 married 2610 6233 1 no 12 3 3 1 2 5 2 2 2 2 2
259 29 1 travel_rarely 408 sales 23 1 life sciences 4 female 45 2 3 sales executive 1 married 7336 11162 1 no 13 3 1 1 11 3 1 11 8 3 10
260 42 1 travel_frequently 481 sales 12 3 life sciences 3 male 44 3 4 sales executive 1 single 13758 2447 0 yes 12 3 2 0 22 2 2 21 9 13 14
287 32 1 travel_rarely 1089 research & development 7 2 life sciences 4 male 79 3 2 laboratory technician 3 married 4883 22845 1 no 18 3 1 1 10 3 3 10 4 1 1
305 49 1 travel_frequently 1475 research & development 28 2 life sciences 1 male 97 2 2 laboratory technician 1 single 4284 22710 3 no 20 4 1 0 20 2 3 4 3 1 3
321 28 1 travel_frequently 1496 sales 1 3 technical degree 1 male 92 3 1 sales representative 3 married 2909 15747 3 no 15 3 4 1 5 3 4 3 2 1 2
327 40 1 travel_rarely 676 research & development 9 4 life sciences 4 male 86 3 1 laboratory technician 1 single 2018 21831 3 no 14 3 2 0 15 3 1 5 4 1 0
349 35 1 travel_rarely 737 sales 10 3 medical 4 male 55 2 3 sales executive 1 married 10306 21530 9 no 17 3 3 0 15 3 3 13 12 6 0
351 24 0 travel_frequently 567 research & development 2 1 technical degree 1 female 32 3 1 research scientist 4 single 3760 17218 1 yes 13 3 3 0 6 2 3 6 3 1 3
366 23 1 travel_rarely 1320 research & development 8 1 medical 4 male 93 2 1 laboratory technician 3 single 3989 20586 1 yes 11 3 1 0 5 2 3 5 4 1 2
368 32 1 travel_rarely 1259 research & development 2 4 life sciences 4 male 95 3 1 laboratory technician 2 single 1393 24852 1 no 12 3 1 0 1 2 3 1 0 0 0
392 37 1 travel_rarely 370 research & development 10 4 medical 4 male 58 3 2 manufacturing director 1 single 4213 4992 1 no 15 3 2 0 10 4 1 10 3 0 8
393 26 1 travel_rarely 920 human resources 20 2 medical 4 female 69 3 1 human resources 2 married 2148 6889 0 yes 11 3 3 0 6 3 3 5 1 1 4
394 46 1 travel_rarely 261 research & development 21 2 medical 4 female 66 3 2 healthcare representative 2 married 8926 10842 4 no 22 4 4 1 13 2 4 9 7 3 7
397 31 1 travel_rarely 359 human resources 18 5 human resources 4 male 89 4 1 human resources 1 married 2956 21495 0 no 17 3 3 0 2 4 3 1 0 0 0
429 21 0 travel_rarely 501 sales 5 1 medical 3 male 58 3 1 sales representative 1 single 2380 25479 1 yes 11 3 4 0 2 6 3 2 2 1 2
434 50 1 travel_frequently 878 sales 1 4 life sciences 2 male 94 3 2 sales executive 3 divorced 6728 14255 7 no 12 3 4 2 12 3 3 6 3 0 1
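A table like the one above can be built by intersecting each model's set of misclassified row indices (a sketch; the index vectors below are hypothetical, one per model):

```r
# Instances misclassified by every model = intersection of per-model index sets
misclass_list <- list(                 # hypothetical index vectors, one per model
  logistic = c(1, 10, 33, 36),
  rf_full  = c(1, 10, 36, 55),
  rf_red   = c(1, 10, 33, 55)
)
shared <- Reduce(intersect, misclass_list)
shared          # 1 10 -- rows every model got wrong
length(shared)  # the analogue of the count of 43 reported above
```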

Observations & Notes

Observations & Notes

  • The reduced Random Forest model on data without outliers has the best AUC on the test set, and the best classification rate among the random forest models.

  • The logistic regression model built on data without outliers has the best correct classification rate on the test set.

  • All random forest models had better AUC values than the logistic regression model, but each also had a lower correct classification rate. The differences were small, however (classification rate differences of approximately 0.009 to 0.011 and AUC differences of approximately 0.0023 to 0.0091).

  • After doing some reading online, disagreements between AUC performance and correct classification rate may occur because of an unbalanced data set and/or an accuracy threshold value of 0.5 (which was used in this project). To troubleshoot, further examination is needed of the ROC curves, the threshold values used, possibly the predicted probabilities, and/or other performance measures (e.g., sensitivity, specificity). Most likely, I suspect the reason the best AUC and the highest classification rate do not belong to the same model is that the data set is unbalanced.

  • Upon inspecting the ROC curves, we note that the area involving the greatest disagreement between the logistic regression model and the random forest models exists where \(1-specificity\) is between \((0.25, 0.5)\).

  • Given the previous work, if the objective is predicting attrition then applying the reduced/sparse random forest model on data that does not contain outliers and/or correlated predictor variables appears to be the best method to use (of those explored here).

  • For the purpose of this project/exercise, I want to explore how various predictor variables affect the odds or probability of \(attrition = 1 (yes)\). To do so, I am choosing the logistic regression model to gain further insights from the data. This model is chosen because:

    • it’s more easily interpreted for inference purposes
    • it has the best correct classification rate
    • its AUC difference is less than 0.01 compared to the other models
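The threshold question raised above can be explored without extra packages. A minimal sketch on toy values: `auc_rank` implements the rank (Wilcoxon) formulation of AUC, and `accuracy_at` recomputes the classification rate at alternative cutoffs:

```r
# Rank (Wilcoxon) formulation of AUC: the probability that a random positive
# instance scores above a random negative one (ties count half)
auc_rank <- function(probs, actual) {
  pos <- probs[actual == 1]
  neg <- probs[actual == 0]
  mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
}
# Classification rate at an arbitrary probability cutoff
accuracy_at <- function(probs, actual, threshold = 0.5) {
  mean(as.integer(probs >= threshold) == actual)
}

p <- c(0.9, 0.8, 0.6, 0.4, 0.3)   # toy predicted probabilities
y <- c(1, 1, 0, 1, 0)             # toy labels
auc_rank(p, y)                                          # 0.833...
sapply(c(0.35, 0.5, 0.65), accuracy_at, probs = p, actual = y)
```

Sweeping the cutoff this way shows whether the 0.5 threshold, rather than the models themselves, drives the AUC-vs.-accuracy disagreement.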

Chosen Model

Chosen model

\[\begin{align} logit[P(attrition = 1 (Yes))] = &11.86865 - 0.36314age + 0.00426age^2 + \beta_2businesstravel \\ &- 0.00064dailyrate + 0.06069distancefromhome + \\ &\beta_7educationfield - 0.90788environmentsatisfaction \\ &- 1.19419jobinvolvement - 0.42537jobsatisfaction + \\ &\beta_{15}maritalstatus + 0.29720numcompaniesworked + \\ &\beta_{19}overtime - 0.20686totalworkingyears \\ &- 0.29303trainingtimeslastyear \\ &- 0.20462yearsincurrentrole + \\ &0.26748yearssincelastpromotion \end{align}\]

where \[\beta_2businesstravel = \begin{cases} 0, \quad for \enspace businesstravel = non-travel(1)\\ & \\ 2.44736, \quad for \enspace businesstravel = travel \enspace rarely(2)\\ & \\ 4.06929, \quad for \enspace businesstravel = travel \enspace frequently(3) \end{cases} \]


\[\beta_7educationfield = \begin{cases} 0, \quad for \enspace educationfield = human \enspace resources(1)\\ & \\ -1.76682, \quad for \enspace educationfield = life \enspace sciences(2)\\ & \\ -0.53888, \quad for \enspace educationfield = marketing(3)\\ & \\ -2.07068, \quad for \enspace educationfield = medical(4)\\ & \\ -3.11794, \quad for \enspace educationfield = other(5)\\ & \\ -0.13154, \quad for \enspace educationfield = technical \enspace degree(6)\\ \end{cases} \]


\[\beta_{15}maritalstatus = \begin{cases} 0, \quad for \enspace maritalstatus = single(1)\\ & \\ -1.60791, \quad for \enspace maritalstatus = married(2)\\ & \\ -1.68362, \quad for \enspace maritalstatus = divorced(3)\\ \end{cases} \]


\[\beta_{19}overtime = \begin{cases} 0, \quad for \enspace overtime = no(1)\\ & \\ 3.13669, \quad for \enspace overtime = yes(2)\\ \end{cases} \]
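A model of this form is fit with `glm()`. The sketch below uses simulated data and only a subset of terms, but the full model follows the same pattern, with the preprocessed training set in place of the toy `train`:

```r
# Sketch: the glm() pattern behind the chosen model, on simulated data;
# I(age^2) supplies the quadratic age term, family = binomial the logit link
set.seed(1)
train <- data.frame(
  age       = sample(18:60, 200, replace = TRUE),
  overtime  = factor(sample(c("no", "yes"), 200, replace = TRUE)),
  attrition = rbinom(200, 1, 0.16)
)
fit <- glm(attrition ~ age + I(age^2) + overtime,
           data = train, family = binomial)
exp(coef(fit))  # odds-ratio scale, as read throughout the Interpretations page
```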

Searchable Data Table

Interpretations

Column

Continuous/numerical variable (scrollable)

  • For every one-year increase in \(age\) the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-0.36314} = 0.6955\), meaning that there’s a 30.5% decrease in the odds of \(attrition = Yes\), holding all other variables fixed. However, because the model also includes a quadratic (squared) term for \(age\), the effect of \(age\) is not a constant linear slope; the slope changes with each additional year of age. The effect of \(age\) on the estimated odds of \(attrition = Yes\) decreases initially, is minimized at age 43, and increases after age 43, holding all other variables fixed. See the following plot.

  • For every one-dollar/day increase in \(dailyrate\) the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-0.00064} = 0.9994\), meaning that there’s a 0.06% decrease in the odds of \(attrition = Yes\), holding all other variables fixed.

  • For every 1-mile increase in \(distancefromhome\) the estimated odds of \(attrition = Yes\) increases by a multiplicative factor of \(e^{0.06069} = 1.0626\), meaning that there’s a 6.26% increase in the odds of \(attrition = Yes\), holding all other variables fixed.

  • For every 1-unit increase in \(numcompaniesworked\) the estimated odds of \(attrition = Yes\) increases by a multiplicative factor of \(e^{0.29720} = 1.3461\), meaning that there’s a 34.6% increase in the odds of \(attrition = Yes\), holding all other variables fixed.

  • For every 1-unit increase in \(trainingtimeslastyear\) the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-0.29303} = 0.7460\), meaning that there’s a 25.4% decrease in the odds of \(attrition = Yes\), holding all other variables fixed.

  • For every 1-year increase in \(yearsincurrentrole\) the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-0.20462} = 0.8150\), meaning that there’s an 18.5% decrease in the odds of \(attrition = Yes\), holding all other variables fixed.

  • For every 1-year increase in \(yearssincelastpromotion\) the estimated odds of \(attrition = Yes\) increases by a multiplicative factor of \(e^{0.26748} = 1.3067\), meaning that there’s a 30.7% increase in the odds of \(attrition = Yes\), holding all other variables fixed.
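Every bullet above is the same arithmetic: a coefficient \(\beta\) corresponds to a percent change in the odds of \(100(e^{\beta} - 1)\). For example, with coefficients from the chosen model:

```r
# A coefficient b maps to a percent change in the odds of 100 * (exp(b) - 1)
odds_pct_change <- function(b) round(100 * (exp(b) - 1), 1)

odds_pct_change(0.06069)   # distancefromhome:         +6.3%
odds_pct_change(-0.29303)  # trainingtimeslastyear:   -25.4%
odds_pct_change(0.26748)   # yearssincelastpromotion: +30.7%
```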

Categorical variables (scrollable)

  • For every 1-unit increase in \(environmentsatisfaction\) rating the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-0.90788} = 0.4034\), meaning that there’s a 59.7% decrease in the odds of \(attrition = Yes\) from the previous rating level, holding all other variables fixed.

  • For every 1-unit increase in \(jobinvolvement\) rating the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-1.19419} = 0.3029\), meaning that there’s a 69.7% decrease in the odds of \(attrition = Yes\) from the previous rating level, holding all other variables fixed.

  • For every 1-unit increase in \(jobsatisfaction\) rating the estimated odds of \(attrition = Yes\) changes by a multiplicative factor of \(e^{-0.42537} = 0.6535\), meaning that there’s a 34.7% decrease in the odds of \(attrition = Yes\) from the previous rating level, holding all other variables fixed.

  • For \(businesstravel = travel rarely\), the estimated odds of \(attrition = Yes\) is \(e^{2.44736} = 11.5578\) times the estimated odds for \(businesstravel = non-travel\). The estimated odds is 1056% greater for the \(businesstravel = travel rarely\) group.

  • For \(businesstravel = travel frequently\), the estimated odds of \(attrition = Yes\) is \(e^{4.06929} = 58.5154\) times the estimated odds for \(businesstravel = non-travel\). The estimated odds is 5752% greater for the \(businesstravel = travel frequently\) group.

  • For \(businesstravel = travel frequently\), the estimated odds of \(attrition = Yes\) is \(e^{4.06929-2.44736} = 5.0629\) times the estimated odds for \(businesstravel = travel rarely\). The estimated odds is 406% greater for the \(businesstravel = travel frequently\) group.

  • For \(educationfield = life sciences\), the estimated odds of \(attrition = Yes\) is \(e^{-1.76682} = 0.1709\) times the estimated odds for \(educationfield = human resources\). The estimated odds is 82.1% lower for the \(educationfield = life sciences\) group.

  • For \(educationfield = marketing\), the estimated odds of \(attrition = Yes\) is \(e^{-0.53888} = 0.5834\) times the estimated odds for \(educationfield = human resources\). The estimated odds is 46.2% lower for the \(educationfield = marketing\) group.

  • For \(educationfield = medical\), the estimated odds of \(attrition = Yes\) is \(e^{-2.07068} = 0.1261\) times the estimated odds for \(educationfield = human resources\). The estimated odds is 87.4% lower for the \(educationfield = medical\) group.

  • For \(educationfield = other\), the estimated odds of \(attrition = Yes\) is \(e^{-3.11794} = 0.0442\) times the estimated odds for \(educationfield = human resources\). The estimated odds is 95.6% lower for the \(educationfield = other\) group.

  • For \(educationfield = technical degree\), the estimated odds of \(attrition = Yes\) is \(e^{-0.13154} = 0.8767\) times the estimated odds for \(educationfield = human resources\). The estimated odds is 12.3% lower for the \(educationfield = technical degree\) group.

  • For \(maritalstatus = married\), the estimated odds of \(attrition = Yes\) is \(e^{-1.60791} = 0.2003\) times the estimated odds for \(maritalstatus = single\). The estimated odds is 80% lower for the \(maritalstatus = married\) group.

  • For \(maritalstatus = divorced\), the estimated odds of \(attrition = Yes\) is \(e^{-1.68362} = 0.1857\) times the estimated odds for \(maritalstatus = single\). The estimated odds is 81.4% lower for the \(maritalstatus = divorced\) group.

  • For \(overtime = yes\), the estimated odds of \(attrition = Yes\) is \(e^{3.13669} = 23.0275\) times the estimated odds for \(overtime = no\). The estimated odds is 2203% greater for the \(overtime = yes\) group.
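Comparisons between two non-reference levels of the same factor, as in the frequent-vs.-rare traveler bullet above, use the difference of the two coefficients:

```r
# Odds ratio between two non-reference factor levels = exp(b_a - b_b)
b_frequently <- 4.06929   # businesstravel = travel frequently
b_rarely     <- 2.44736   # businesstravel = travel rarely
or <- exp(b_frequently - b_rarely)
round(or, 4)           # 5.0629 -- frequent vs. rare travelers
round(100 * (or - 1))  # ~406% higher estimated odds
```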

Column

Age Effect

Observations

Variables with no information value
* \(employeecount\) - only one unique value; each “1” represents a single employee
* \(over18\) - only one unique value; all employees are \(\geq\) 18 years old
* \(standardhours\) - only one unique value; each employee works a standard 80-hr work week over a two-week period
* \(employeenumber\) - value represents an indexing method to identify each employee
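Zero-information columns like these can be flagged programmatically (a sketch on a toy data frame standing in for the imported data set):

```r
# Flag columns with a single unique value -- they carry no modeling information
hr <- data.frame(                      # toy stand-in for the real data
  age           = c(41, 49, 37),
  employeecount = c(1, 1, 1),
  over18        = c("y", "y", "y"),
  standardhours = c(80, 80, 80)
)
constant_cols <- names(hr)[sapply(hr, function(x) length(unique(x)) == 1)]
constant_cols  # "employeecount" "over18" "standardhours"
```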

Variables not impacting the model & its outcome
* Independence tests during logistic regression modeling indicated that the following variables had no relationship with the response variable. These variables were excluded from the model due to variable independence:
+ \(gender\)
+ \(relationshipsatisfaction\)
+ \(worklifebalance\)

Variables removed due to collinearity
* The following variables were removed during logistic regression modeling because they were highly correlated with other variables in the model:
+ \(department\)
+ \(joblevel\)
+ \(jobrole\)
+ \(monthlyincome\)
+ \(yearsatcompany\)
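A quick base-R screen for collinear candidates looks like this (a sketch on simulated columns; a fuller treatment would also check variance inflation factors):

```r
# Screen numeric predictors for near-collinear pairs with a correlation cutoff
set.seed(2)
x <- data.frame(years_a = 1:20 + rnorm(20, sd = 0.5))
x$years_b <- x$years_a + rnorm(20, sd = 0.5)  # nearly collinear with years_a
x$noise   <- rnorm(20)                        # unrelated column for contrast

cm   <- cor(x)
high <- which(abs(cm) > 0.9 & upper.tri(cm), arr.ind = TRUE)
cbind(rownames(cm)[high[, 1]], colnames(cm)[high[, 2]])  # the flagged pair
```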

Categorical predictor combination contributing to highest estimated \(attrition = Yes\)
* \(businesstravel = travel frequently\)
* \(educationfield = human resources\)
* \(jobinvolvement = low\)
* \(jobsatisfaction = low\)
* \(maritalstatus = single\)
* \(overtime = yes\)

Categorical predictor combination contributing to lowest estimated \(attrition = Yes\)
* \(businesstravel = never\)
* \(educationfield = other\)
* \(jobinvolvement = very high\)
* \(jobsatisfaction = very high\)
* \(maritalstatus = divorced\)
* \(overtime = no\)

Variables affecting attrition the most
* \(numcompaniesworked\) and \(yearssincelastpromotion\) each lead to an increase in the odds of attrition as the variable value increases
* \(trainingtimeslastyear\), \(yearsincurrentrole\), \(environmentsatisfaction\), \(jobinvolvement\), and \(jobsatisfaction\) each lead to a notable decrease in the odds of attrition as the variable value increases

Post-modeling Exploration

Attrition By Department & Gender

Findings & recommendations

Column

Major Findings

  • The highest number of employees that attrited were in the Research and Development department followed by the Sales department.

  • In descending order of effect, \(numcompaniesworked\) and \(yearssincelastpromotion\) each, individually, have the greatest increasing effect on the odds of attrition

  • In descending order of effect, \(jobinvolvement\), \(environmentsatisfaction\), \(jobsatisfaction\), \(trainingtimeslastyear\), and \(yearsincurrentrole\) each, individually, have the greatest decreasing effect on the odds of attrition

  • Individually, the effect of \(age\) decreases odds of attrition every year from ages 18-42. At age 43 the effect of \(age\) is minimized. Afterwards, beginning at age 44, the effect of \(age\) increases odds of attrition.

  • The odds of attrition are substantially lower for married or divorced employees than for single employees.

  • The odds of attrition are lower for each educationfield category compared to those with an education field of human resources.

  • The odds of attrition are greater for both frequent and rare business travelers compared to non-travelers.
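The age turning point quoted above follows directly from the quadratic term: for a contribution \(\beta_1 age + \beta_2 age^2\), the effect is minimized at \(age = -\beta_1 / (2\beta_2)\):

```r
# Turning point of the quadratic age effect: age* = -b1 / (2 * b2)
b_age  <- -0.36314   # linear age coefficient from the chosen model
b_age2 <-  0.00426   # quadratic age coefficient
-b_age / (2 * b_age2)  # ~42.6, i.e. the effect bottoms out around age 43
```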

Column

Recommendations

  • Focus on R&D department first, Sales department second

  • Based on data exploration and model findings, consider initially focusing efforts on employees who:

    • are single
    • are 18-30 years old
    • previously worked for < 3 companies
    • had 0-3 training times last year
    • have < 4 years in their current role
    • have < 3 years since their last promotion
    • work overtime
    • travel rarely for business
    • have a life sciences, marketing, and/or medical education field
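The screening profile above translates to a straightforward filter (a sketch on toy rows; the real column names follow the cleaned data set):

```r
# Pull the employees matching the initial-focus profile
hr <- data.frame(                      # toy stand-in rows
  maritalstatus      = c("single", "married", "single"),
  age                = c(24, 45, 52),
  numcompaniesworked = c(1, 4, 2),
  overtime           = c("yes", "no", "yes")
)
focus <- subset(hr, maritalstatus == "single" & age >= 18 & age <= 30 &
                    numcompaniesworked < 3 & overtime == "yes")
nrow(focus)  # 1 -- only the first toy employee meets every criterion
```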

Potential strategy

  1. Look at employee placement first.
    • Should some employees move to a different department?
    • Would they be happier? More engaged?
    • Are they currently in the department/role that is an optimal fit?
  2. Provide adequate and appropriate training
    • Employees may not feel that they are getting enough of the right training
  3. Reduce and/or re-align travel according to job role & department
    • Some employees may need to travel more to fully accomplish various tasks
    • Other employees may feel that they travel too much
  4. Address overtime
    • Can overtime be reduced?
    • Are temporary or seasonal hires needed?
    • Can overly aggressive deadlines be extended? What race is the team actually trying to win?
    • Eliminate redundant or unnecessary job process requirements
  • Aggregated satisfaction ratings (counts) may indicate some success with implemented changes.

  • Review status quarterly or semi-annually.

Other considerations

Column

Other helpful data

  • Include attrition categories - i.e., instead of grouping by \(attrition = yes\) or \(no\), consider expanding the number of categories/reasons for attrition, such as: quit, fired, resigned, retired, deceased, medical, relocated, etc.

  • Clarify meanings of \(dailyrate\), \(hourlyrate\), \(monthlyrate\) and/or how they relate to \(monthlyincome\)

  • Should we assume that \(dailyrate\), \(hourlyrate\), and \(monthlyrate\) represent an employee’s salary? If so, shouldn’t they be consistent? (i.e., assuming an 8-hour workday, 8 × \(hourlyrate\) should equal \(dailyrate\), etc.)

  • Does \(monthlyincome\) represent employee salary before or after deducting taxes & contributions (i.e., income tax, Social Security, medical/vision/dental insurance, etc.)?

  • The amount of overtime (i.e., the number of overtime hours worked, which day of the week overtime was worked, whether overtime was worked on a holiday and which one, etc.) may provide more information/insight than a ‘yes’ or ‘no’ response.

  • The type and amount of training received last year may be more informative and provide better insight (i.e., online, seminar, webinar, brown-bag, formal class, class at outside formal institution [also, online, blended, traditional], etc.)

  • Exclusive of what the model indicates, compare data to relevant HR/employment requirements - is the data representative of meeting or not meeting certain state or federal employment guidelines/requirements? Diversity comes to mind. If certain requirements are not being met, then the fulfillment of those requirements could cause change in the model and what insights it leads to.

Column

Other things to try and/or explore

  • Look at a comparison of the misclassified instances from the test set vs. the instances with high leverage in the training set. Are there similarities or differences? Anything that might indicate what’s causing the misclassification?

  • GAM model - to try a smoothing, non-linear model

  • Incorporate/use SQL (via the \(sqldf\) package) to compare outlier instances vs. high-leverage instances found in the training set. How are they similar? How are they different?

  • Incorporate/use SQL to compare misclassified instances from the test set vs. the outlier and/or high-leverage instances in the training set. How are they similar? How are they different? This could be informative about why those instances in the test set were misclassified.

  • Discretize, or group, select predictor variables, such as \(age\), \(distancefromhome\), \(dailyrate\), \(hourlyrate\), and/or \(monthlyincome\).

  • Bootstrap or randomly sample instances from the data set to add observations that balance the response variable

    • is this an acceptable practice?
    • how would the model change or be different (at least for logistic regression)?
  • Other possible models

    • AdaBoost
    • Neural net(s) - simple, CNN, RNN, etc.
    • Survival analysis
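A minimal sketch of the bootstrap/upsampling idea above, on a toy response (whether this is acceptable practice is exactly the open question raised there):

```r
# Balance the response by upsampling the minority class with replacement
set.seed(3)
y <- c(rep(0, 8), rep(1, 2))   # toy imbalanced response
minority <- which(y == 1)
extra <- sample(minority, size = sum(y == 0) - length(minority), replace = TRUE)
y_balanced <- c(y, y[extra])
table(y_balanced)  # 0: 8, 1: 8
```

With the real data, the same resampled indices would be used to duplicate entire rows of the training set, not just the response.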

Final thoughts